BOUNDS AND RATES OF CONVERGENCE FOR THE EXTENDED COMPOUND
ESTIMATION PROBLEM IN THE SEQUENCE CASE

I.   Introduction and Summary.


A.   The  problem. 

Let $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_i, \ldots)$ be a countably infinite vector
whose components $\theta_i$ are elements of some finite interval $\Omega$ of the
real line. Let $\mathcal{P} = \{p_\theta(\cdot) : \theta \in \Omega\}$ be, for some measure $\mu$, a family of
known probability density functions with parameter $\theta$. Let $X_i$ be a
real valued random variable with density $p_{\theta_i}(\cdot)$. Suppose the vector $\boldsymbol{\theta}$
is unknown and for each $i$ it is desired to estimate $\theta_i$. The estimates
are to be made in sequence and the estimate of $\theta_i$ may be based on the
independent observations $X_j$, $j = 1, \ldots, i$. Thus for each
$i = 1, 2, \ldots$, a non randomized estimator $\varphi_i(\mathbf{X}_i)$ is sought for $\theta_i$,
where $\mathbf{X}_i$ is the vector of observations $(X_1, \ldots, X_i)$. It is assumed
that at each stage of this estimation problem one suffers squared error
loss, so that if $\varphi_i$ is the estimate of $\theta_i$ a loss of $(\varphi_i - \theta_i)^2$
units is suffered. The risk of an estimator $\varphi_i$ is defined to be the
expected loss, that is $E[(\varphi_i(\mathbf{X}_i) - \theta_i)^2]$. The average risk for $n$
estimations becomes

$$\frac{1}{n} \sum_{i=1}^{n} E[(\varphi_i(\mathbf{X}_i) - \theta_i)^2].$$

One would like to find, for specified $\Omega$ and $\mathcal{P}$, a decision procedure
$\boldsymbol{\varphi} = (\varphi_1, \varphi_2, \ldots)$ which, on the basis of its average risk for the
first $n$ estimations, is in some sense optimal for large $n$.


One way in which such a problem could arise is as follows. Suppose
the  Navy  wishes  to  screen  all  new  recruits  and  to  classify  them  on  the 
basis  of  their  "natural  aptitudes"  to  be  radar  technicians.   In  an 
attempt  to  do  this,  each  recruit  is  given  a  test  whose  outcome  can  be 
represented  as  a  number.   Suppose  also  that  "natural  aptitude"  can  be 
represented  on  a  numerical  scale.   On  the  basis  of  prolonged  testing  and 
evaluation  in  the  past,  the  Navy  has  been  able  to  fit  a  good  probability 
distribution  model  for  the  outcome  of  a  person's  test  score  given  his 
"true"  aptitude  as  a  parameter.   The  Navy  now  wants  to  estimate  each  new 
recruit's  aptitude  on  the  basis  of  his  test  score.   While  squared  error 
loss  is  somewhat  artificial,  it  is  clear  that  the  more  the  Navy  errs  in 
estimating  a  recruit's  aptitude  the  greater  the  loss  it  suffers,  and 
squared  error  loss  is  a  convenient  way  to  represent  this.   In  this 
example it is also apparent that many decisions will be made, and from
the Navy's point of view, the average risk incurred is a reasonable basis
upon which to judge the "optimality" of a decision procedure. In this
example, then, $\theta_i$ would be the $i$th recruit's true aptitude and $X_i$
would be his test score.

In the preceding example it is not unreasonable to assume each
recruit's aptitude is independent of all other recruits' aptitudes. An
example will now be presented in which it is not unreasonable to suppose
the $\theta_i$'s would occur in "patterns." Suppose a Navy anti-submarine
group is on patrol duty to guard against submarine penetration. It is
necessary, in deciding what type of patrol to carry out, to have an
estimate of the average sonar detection range. This range will depend
upon many different factors such as sea temperature and salinity, as
well as the sonar equipment involved. Suppose a test is conducted every
few hours whose results follow reasonably well a known probability
distribution with the true average detection range as a parameter. In this
example then, $\theta_i$ would be the true detection range and $X_i$ the test
result. One would not expect $\theta_i$ and $\theta_{i+1}$ to be unrelated, however,
as the conditions fixing their value, while changing, are changing more
or less continuously in time and a high value of $\theta_i$ would tend to mean
a high value of $\theta_{i+1}$ as well. In this example, as in the previous one,
a decision about the true detection range will be made many times, and
the average risk is a reasonable criterion to use in evaluating a decision
procedure.

B.   Known  results. 

The  problem  of  finding  a  good  estimator  is  really  twofold.   First 
some  standard  of  optimality  must  be  established,  and  secondly  a  procedure 
must  be  found  which  yields  good  results  according  to  this  standard. 

Samuel [11] has considered the following standard. Fix $\boldsymbol{\theta}_n$. Let $G_n(\cdot)$
be the empirical distribution function of $\boldsymbol{\theta}_n$. That is

$$G_n(x) = \frac{1}{n} \, (\text{the number of } i \text{ such that } \theta_i \le x).$$

Let $\{\Theta_i ;\ i = 1, \ldots, n\}$ be mutually independent identically distributed
random variables with a priori distribution function $G_n$. If we now
consider $X_i$ to be an observation of a random variable with the conditional
density function $p_\theta$ given that $\Theta_i = \theta$, then the usual Bayes argument
gives $\varphi(X_i) = E[\Theta_i \mid X_i]$ as the estimator achieving the minimum Bayes risk
$R(G_n)$. Of course this procedure does not apply to the compound estimation
problem since $G_n$ is unknown and in any case the $\theta_i$ are not
observations of random variables. Nevertheless Samuel has shown $R(G_n)$ is
an "optimal" standard to use in evaluating a procedure $\boldsymbol{\varphi}$ in the
following sense: Let $R_n(\boldsymbol{\varphi}, \boldsymbol{\theta})$ denote the average risk for the first
$n$ decisions incurred by a decision procedure $\boldsymbol{\varphi}$ against a parameter
vector $\boldsymbol{\theta}$. Then $R(G_n)$ is an "optimal" standard in that if one considers
only the class of "obvious" procedures $\{\boldsymbol{\varphi} : \varphi_i(\mathbf{X}_i) = \varphi(X_i),\ i = 1, \ldots, n\}$
then $\forall\, n$, $R_n(\boldsymbol{\varphi}, \boldsymbol{\theta}) \ge R(G_n)$. In other words if one
bases his decision about $\theta_i$ only on the observation having $\theta_i$ as a
parameter and uses the same rule for each $i$, one can never achieve a
lower average risk than the number $R(G_n)$.
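The standard $R(G_n)$ can be evaluated numerically for a concrete family. The following is a minimal sketch, assuming the Poisson family and a truncated sum over the sample space; the function names and the truncation point `x_max` are illustrative choices, not part of the paper.

```python
import math

def poisson_pmf(x, theta):
    """Poisson density p_theta(x) = exp(-theta) * theta**x / x!."""
    return math.exp(-theta) * theta ** x / math.factorial(x)

def bayes_estimate(x, thetas):
    """Posterior mean of the parameter given X = x when the prior is the
    empirical distribution G_n of the values in `thetas`."""
    weights = [poisson_pmf(x, t) for t in thetas]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # arbitrary convention on a set of probability zero
    return sum(t * w for t, w in zip(thetas, weights)) / total

def average_risk(rule, thetas, x_max=60):
    """Average over i of E[(rule(X_i) - theta_i)^2] with X_i ~ Poisson(theta_i).
    The sum over x is truncated at x_max, so this is an approximation."""
    total = 0.0
    for t in thetas:
        for x in range(x_max + 1):
            total += poisson_pmf(x, t) * (rule(x) - t) ** 2
    return total / len(thetas)

def bayes_risk(thetas, x_max=60):
    """R(G_n): the average risk of the Bayes rule against G_n."""
    return average_risk(lambda x: bayes_estimate(x, thetas), thetas, x_max)
```

For any rule that uses only the current observation, such as `lambda x: float(x)`, `average_risk` should come out no smaller than `bayes_risk` on the same parameter values, which is the sense in which $R(G_n)$ serves as a standard.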

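Samuel also gives several sufficient conditions on $\Omega$, $\{p_\theta : \theta \in \Omega\}$,
and $\boldsymbol{\varphi}$ which ensure that for each fixed $\boldsymbol{\theta}$

$$\lim_{n \to \infty} \left( R_n(\boldsymbol{\varphi}, \boldsymbol{\theta}) - R(G_n) \right) \le 0$$

and in several cases she exhibits specific procedures which satisfy the
above condition.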

Robbins [6] [7] [8] and Johns [3] have done work in the related
empirical Bayes problem (see Chapter III, Section G) and many of the
decision procedures they derive are also "optimal" in the compound
decision problem. Extensions of their estimators will be used in later
sections.

C.    Summary  of  new  results. 

As mentioned in Section B it is first necessary to establish a
reasonable standard of "optimality" to use in evaluating a particular
decision procedure. Many reasons have been advanced in the literature
for considering the risk $E[(\varphi_i - \theta_i)^2]$ to be a good indication of how
well a particular decision rule does. In the compound decision problem,
it seems even more reasonable to consider the average risk $R_n(\boldsymbol{\varphi}, \boldsymbol{\theta})$ as
a reliable index to be used in evaluating a particular decision procedure
$\boldsymbol{\varphi}$, and this is the index adopted in this paper. A standard $R(\boldsymbol{\theta}_n)$ is
now needed such that if for all $\boldsymbol{\theta}$ and $n$, $R_n(\boldsymbol{\varphi}, \boldsymbol{\theta})$ is no greater than
$R(\boldsymbol{\theta}_n)$, one would be willing to say $\boldsymbol{\varphi}$ is a good decision procedure.
Samuel has given good intuitive reasons for selecting $R(\boldsymbol{\theta}_n) = R(G_n)$, and
has made the statement [11] that $R(G_n)$ cannot, in the limit, be improved
upon. Based on an idea of Johns [5], a sequence of more stringent
standards $\{R_k(\boldsymbol{\theta}_n) : k = 1, 2, \ldots\}$ will, however, be obtained in this paper such
that $R_1(\boldsymbol{\theta}_n) = R(G_n)$; and for any fixed $k = 1, 2, \ldots$ and for all $\boldsymbol{\theta}$,

$$R_k(\boldsymbol{\theta}_n) = R_{k+1}(\boldsymbol{\theta}_n) + f(k, n, \boldsymbol{\theta}) + h(k, n, \boldsymbol{\theta})$$

where $f(k, n, \boldsymbol{\theta}) \ge 0$ and $h(k, n, \boldsymbol{\theta}) = O(1/n)$ uniformly in $\boldsymbol{\theta}$. In addition
for "most" $\boldsymbol{\theta}$, $f(k, n, \boldsymbol{\theta})$ is in fact strictly positive. $R_k(\boldsymbol{\theta}_n)$ will be shown to be
the minimum Bayes risk possible if in fact $\boldsymbol{\theta}_n$ is a realization of an
$n$ dimensional random vector whose last $k$ components are independently
distributed from the first $n - k$ components according to the $k$
dimensional empirical distribution function generated by $\boldsymbol{\theta}_n$. The
extended compound estimation problem is defined to be the problem of finding
procedures which asymptotically achieve these standards. The analogous
problem in the empirical Bayes case is being considered by Barndorff-Nielsen
[1]. To make these statements more explicit several definitions
are needed. These definitions will be used throughout the paper.


Def. 1) Let $\Omega$ be a bounded interval of the real line. Let
$\mathcal{P} = \{p_\theta : \theta \in \Omega\}$ be a family of probability density functions with respect
to some measure $\mu$. Let $\{\theta_j : \theta_j \in \Omega,\ j = 1, 2, 3, \ldots\}$ be an arbitrary
sequence. Let $\{X_j : j = 1, 2, \ldots\}$ be a sequence of mutually
independent real valued random variables with $X_j$ distributed according to
$p_{\theta_j}$. Let $\mathbf{X}_i = (X_1, \ldots, X_i)$. Let $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_i, \ldots)$ where
$\theta_i \in \Omega$, $i = 1, 2, \ldots$. Let $\boldsymbol{\theta}_n$ be the vector consisting of the first
$n$ components of $\boldsymbol{\theta}$.

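Def. 2) $\forall\, \boldsymbol{\theta}$, $n$, $\forall\, k = 1, \ldots, n$ the $k$th order empirical
distribution function of $\boldsymbol{\theta}_n$ is:

$$G_n^k(y_1, y_2, \ldots, y_k) = \frac{1}{n - k + 1} \, \big(\#\text{ of } j\ (k \le j \le n) \text{ such that } \theta_{j-k+i} \le y_i,\ i = 1, 2, \ldots, k\big).$$

When $k = 1$ this definition yields the usual empirical distribution
function.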
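The $k$th order empirical distribution function counts windows of $k$ consecutive parameter values. A minimal sketch of that count follows; the function name and the 0-based list layout are assumptions made for illustration.

```python
def empirical_cdf_k(thetas, k, y):
    """k-th order empirical distribution function of thetas = (theta_1..theta_n)
    evaluated at y = (y_1..y_k).

    Counts the indices j (k <= j <= n, 1-based) whose window
    (theta_{j-k+1}, ..., theta_j) is componentwise <= y, divided by n - k + 1.
    """
    n = len(thetas)
    if not 1 <= k <= n or len(y) != k:
        raise ValueError("need 1 <= k <= n and len(y) == k")
    count = 0
    for j in range(k, n + 1):
        window = thetas[j - k:j]  # 0-based slice holding theta_{j-k+1..j}
        if all(t <= yi for t, yi in zip(window, y)):
            count += 1
    return count / (n - k + 1)
```

With `k = 1` this reduces to the usual empirical distribution function; with `k = 2` on an alternating sequence it separates the two window types `(low, high)` and `(high, low)`, which is the "pattern" information the higher order standards exploit.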

Let $k$ and $m$ be fixed arbitrary positive integers, $k \le m$. Let
$\{\Theta_i : i = 1, \ldots, m\}$ be a sequence of random variables with range space
$\Omega$. Let $\Theta_{m-k+1}, \ldots, \Theta_m$ have an a priori joint distribution function
$G$ and assume the remaining $\Theta_i$ are distributed independently of $\Theta_m$.
Let $\{X_i : i = 1, \ldots, m\}$ be a sequence of random variables with
conditional density functions $p_{\theta_i}$ given $\Theta_i = \theta_i$ such that the $X_i$
are mutually conditionally independent given the $\Theta_i$. For estimating
the realization $\theta_m$ of $\Theta_m$ it is well known that the estimate
$E[\Theta_m \mid X_{m-k+1}, \ldots, X_m]$, which depends only on the last $k$
observations, is a Bayes estimate and achieves the Bayes risk $R(G)$.


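Def. 3) $\forall\, \boldsymbol{\theta}$, $n$, $\forall\, k = 1, 2, \ldots, n$
let $R_k(\boldsymbol{\theta}_n) = R(G_n^k)$ where $G_n^k$ is the $k$th order empirical distribution
function of $\boldsymbol{\theta}_n$. Thus $R_k(\boldsymbol{\theta}_n)$ is the Bayes risk for $G_n^k$.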

Using the above definitions it will be shown in Theorem 1) that

$$f(k, n, \boldsymbol{\theta}) = E\{[E(\Theta_{k+1} \mid \mathbf{X}_{k+1}) - E(\Theta_{k+1} \mid X_2, \ldots, X_{k+1})]^2\}$$

where $\Theta_1, \ldots, \Theta_{k+1}$ have the a priori joint distribution function $G_n^{k+1}$.
It is clear that $f$ is always non negative and will equal zero only if
$E[\Theta_{k+1} \mid \mathbf{X}_{k+1}] = E[\Theta_{k+1} \mid X_2, \ldots, X_{k+1}]$ with probability one. This
condition is clearly satisfied for most $\mathcal{P}$ only if $\boldsymbol{\theta}_n$ generates an
empirical distribution function $G_n^{k+1}$ such that $\Theta_1$ and $\Theta_{k+1}$ are
independently distributed. It is not unreasonable to suppose that "few"
arbitrary sequences, occurring in situations leading to the compound
decision problem, will satisfy this condition, even as $n$ approaches $\infty$.
Another necessary condition for $E[\Theta_{k+1} \mid \mathbf{X}_{k+1}] = E[\Theta_{k+1} \mid X_2, \ldots, X_{k+1}]$
is that the sample serial correlation coefficient lag $k + 1$ of
$\{\theta_i : i = 1, \ldots, n\}$ be zero. Again it seems unlikely that many
sequences of $\theta_i$ would have this property, especially for small values
of $k$. In particular if $\boldsymbol{\theta}_n$ has repeated "patterns" of length greater
than $k$, neither of these conditions would be expected to hold.
Accepting $R_k(\boldsymbol{\theta}_n)$ as a standard to be used in evaluating a decision
procedure $\boldsymbol{\varphi}$, attention is turned to constructing procedures for specific
classes $\mathcal{P}$, and to evaluating these procedures.


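Def. 4) Let $\mathcal{E}_n^k(\boldsymbol{\varphi}, \boldsymbol{\theta}) = R_n(\boldsymbol{\varphi}, \boldsymbol{\theta}) - R_k(\boldsymbol{\theta}_n)$. Thus $\mathcal{E}_n^k(\boldsymbol{\varphi}, \boldsymbol{\theta})$
represents the difference after $n$ decisions between the average risk
attained by a particular decision procedure and the $k$th standard.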

For many important classes $\mathcal{P}$, including the normal, gamma, a
discrete exponential family, and a "non-parametric" class, decision
procedures $\boldsymbol{\varphi}^k$ will be found and an upper bound $B(k, n)$ will be given
such that $\mathcal{E}_n^k(\boldsymbol{\varphi}^k, \boldsymbol{\theta}) \le B(k, n)$ for all $\boldsymbol{\theta} \in \Omega^\infty$ and such that
$\lim_{n \to \infty} B(k, n) = 0$. For the discrete exponential family, which includes
the geometric, negative binomial, and Poisson families, it will be
shown that $B(k, n) = O(\cdot)$ at an explicit rate in $n$ [rate expression illegible
in the source]. These results represent a considerable
improvement over those obtained by Samuel [11] who considered only the
case $k = 1$ and showed

$$\lim_{n \to \infty} \left[ R_n(\boldsymbol{\varphi}, \boldsymbol{\theta}) - R_1(\boldsymbol{\theta}_n) \right] \le 0$$

for any fixed $\boldsymbol{\theta}$ in a parameter space more restricted than that
considered in this paper. If $\mathcal{P}$ is the class of binomial probability
density functions, a decision procedure is obtained which attains a
lower average risk than previously known procedures, and an explicit
order in $\log n$ and $n$ [expression illegible in the source] is
obtained as the rate of convergence of this risk to its "standard."


II.   Preliminary Results.

In this chapter we shall first prove that $R_k(\boldsymbol{\theta}_n) = R_{k+1}(\boldsymbol{\theta}_n) +
f(k, n, \boldsymbol{\theta}) + h(k, n, \boldsymbol{\theta})$ where $f$ and $h$ have the properties stated
in Chapter I. We shall then develop a general theorem and corollary
which will enable us to obtain specific decision procedures in Chapter III.
Finally we shall prove several lemmas which will be useful in Chapter III.

Def. 5) Let $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ be an arbitrary vector.
For $k \le n$ we define $\mathbf{y}_n^k = (y_{n-k+1}, y_{n-k+2}, \ldots, y_n)$.

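Def. 6) $\forall\, \mathcal{P}$, $\boldsymbol{\theta}$, $\forall\, k$, $n$ such that $1 \le k \le n$ let

$$Q_n^k(\mathbf{x}_k) = \frac{1}{n - k + 1} \sum_{j=k}^{n} \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i),$$

$$Q_n^{*k}(\mathbf{x}_k) = \frac{1}{n - k + 1} \sum_{j=k}^{n} \theta_j \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i).$$

While both $Q$ and $Q^*$ are functions of several variables which are
not explicit in the notation, it will always be clear in context what
arguments are intended. We note that $Q_n^k(\mathbf{x}_k)$ is the unconditional
density function of a random vector $\mathbf{X}_k$ if the parameters $\theta_1, \ldots, \theta_k$
are assumed to be random variables $\Theta_1, \ldots, \Theta_k$ with a priori
distribution function $G_n^k$.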


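Def. 7) $\forall\, \mathcal{P}$, $\boldsymbol{\theta}$, $\forall\, k$, $n$ such that $1 \le k \le n$, let

$$\psi_n^k(\mathbf{x}_k) = \frac{\sum_{j=k}^{n} \theta_j \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i)}{\sum_{j=k}^{n} \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i)} \quad \text{if } Q_n^k > 0,$$

and [value illegible in the source] otherwise.

Let $m$ and $n$ be integers such that $1 \le k \le m$, and $n < \infty$, then
$\psi_n^k(\mathbf{X}_m^k)$ is one version of $E[\Theta_m \mid \mathbf{X}_m^k]$, and

$$R_k(\boldsymbol{\theta}_n) = E[(\Theta_m - \psi_n^k(\mathbf{X}_m^k))^2] = E[\Theta_m^2 - (\psi_n^k(\mathbf{X}_m^k))^2].$$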
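The Bayes estimator against the $k$th order empirical prior is a weighted average of the last parameter of every length-$k$ window, with weights given by the likelihood of the window. The sketch below assumes the Poisson family; the name `psi` and the `fallback` value used when the denominator vanishes are illustrative assumptions, since the source does not legibly give that value.

```python
import math

def poisson_pmf(x, theta):
    """Poisson density p_theta(x) = exp(-theta) * theta**x / x!."""
    return math.exp(-theta) * theta ** x / math.factorial(x)

def psi(thetas, k, x, fallback=0.0):
    """Posterior mean of the last parameter of a length-k window, given the
    k observations x = (x_1..x_k), under the k-th order empirical prior
    generated by thetas = (theta_1..theta_n).

    `fallback` is returned when the marginal density is zero (assumed value).
    """
    n = len(thetas)
    num = 0.0
    den = 0.0
    for j in range(k, n + 1):
        window = thetas[j - k:j]  # (theta_{j-k+1}, ..., theta_j)
        w = math.prod(poisson_pmf(xi, ti) for xi, ti in zip(x, window))
        num += window[-1] * w     # theta_j times the window likelihood
        den += w
    return num / den if den > 0 else fallback
```

On an alternating sequence such as `[1.0, 5.0] * 10`, calling `psi(thetas, 2, (8, 0))` uses the large previous observation to infer that the current parameter is almost surely the small one, whereas the first order rule `psi(thetas, 1, (0,))` cannot use that information.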


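Def. 8) $\forall\, k$, $j$ such that $1 \le k \le j$ let

$$F_j[\varphi, \theta_j] = E[(\varphi(\mathbf{X}_j^k) - \theta_j)^2]$$

where $\varphi(\cdot)$ is an arbitrary non randomized estimator with a
$k$-dimensional argument.

$\forall\, n \ge k$ let

$$R(\varphi, \boldsymbol{\theta}_n) = \frac{1}{n - k + 1} \sum_{j=k}^{n} F_j[\varphi, \theta_j].$$

$R(\varphi, \boldsymbol{\theta}_n)$ is then the Bayes risk in using the rule $\varphi(\mathbf{X}_k)$ as an estimate
of $\theta_k$ when $\Theta_1, \ldots, \Theta_k$ have the a priori distribution $G_n^k$.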

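We now compare $R_k(\boldsymbol{\theta}_n)$ with $R_{k+1}(\boldsymbol{\theta}_n)$ and prove:

Theorem 1) $\forall\, \mathcal{P}$, $\boldsymbol{\theta}$, $k$, $n$, $1 \le k < n < \infty$,

$$R_k(\boldsymbol{\theta}_n) = R_{k+1}(\boldsymbol{\theta}_n) + f(k, n, \boldsymbol{\theta}) + h(k, n, \boldsymbol{\theta})$$

where $f(k, n, \boldsymbol{\theta}) \ge 0$ and $|h(k, n, \boldsymbol{\theta})| = O(1/n)$ uniformly in $\boldsymbol{\theta}$.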

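Proof:

Let $E_1[\cdot]$ refer to expectation with respect to $G_n^k$ and $E_2[\cdot]$
refer to expectation with respect to $G_n^{k+1}$. For any estimator $\varphi(\mathbf{X}_{k+1})$

$$R(\varphi(\mathbf{X}_{k+1}), \boldsymbol{\theta}_n) = E_2[(\varphi(\mathbf{X}_{k+1}) - \Theta_{k+1})^2]$$

$$= E_2[\Theta_{k+1}^2] - E_2[E_2^2(\Theta_{k+1} \mid \mathbf{X}_{k+1})] + E_2\{[E_2(\Theta_{k+1} \mid \mathbf{X}_{k+1}) - \varphi(\mathbf{X}_{k+1})]^2\},$$

which of course is minimized for $\varphi(\mathbf{X}_{k+1}) = E_2[\Theta_{k+1} \mid \mathbf{X}_{k+1}]$. Letting
$\varphi(\mathbf{X}_{k+1}) = E_2[\Theta_{k+1} \mid \mathbf{X}_{k+1}^k]$ we obtain

$$R_k(\boldsymbol{\theta}_n) - R_{k+1}(\boldsymbol{\theta}_n) = R_k(\boldsymbol{\theta}_n) - R(E_2[\Theta_{k+1} \mid \mathbf{X}_{k+1}^k], \boldsymbol{\theta}_n) + E_2\{[E_2(\Theta_{k+1} \mid \mathbf{X}_{k+1}) - E_2(\Theta_{k+1} \mid \mathbf{X}_{k+1}^k)]^2\}.$$

Let

$$f(k, n, \boldsymbol{\theta}) = E_2\{[E_2(\Theta_{k+1} \mid \mathbf{X}_{k+1}) - E_2(\Theta_{k+1} \mid \mathbf{X}_{k+1}^k)]^2\},$$

$$h(k, n, \boldsymbol{\theta}) = R_k(\boldsymbol{\theta}_n) - R(E_2[\Theta_{k+1} \mid \mathbf{X}_{k+1}^k], \boldsymbol{\theta}_n).$$

Then it remains only to show $|h(k, n, \boldsymbol{\theta})| = O(1/n)$ uniformly in $\boldsymbol{\theta}$. But
since

$$R_k(\boldsymbol{\theta}_n) = E_1[\Theta_k^2] - E_1(E_1^2[\Theta_k \mid \mathbf{X}_k])$$

and

$$R(E_2[\Theta_{k+1} \mid \mathbf{X}_{k+1}^k], \boldsymbol{\theta}_n) = E_2[\Theta_{k+1}^2] - E_2(E_2^2[\Theta_{k+1} \mid \mathbf{X}_{k+1}^k]),$$

and since $\Omega$ is a bounded interval, say $\theta \in \Omega \Rightarrow |\theta| \le B < \infty$, we have

$$|h(k, n, \boldsymbol{\theta})| \le |E_1(\Theta_k^2) - E_2(\Theta_{k+1}^2)| + |E_1(E_1^2[\Theta_k \mid \mathbf{X}_k]) - E_2(E_2^2[\Theta_{k+1} \mid \mathbf{X}_{k+1}^k])|$$

and

$$|E_1(\Theta_k^2) - E_2(\Theta_{k+1}^2)| = \left| \frac{1}{n - k + 1} \sum_{j=k}^{n} \theta_j^2 - \frac{1}{n - k} \sum_{j=k+1}^{n} \theta_j^2 \right| = \frac{1}{(n - k + 1)(n - k)} \left| -\sum_{j=k+1}^{n} \theta_j^2 + (n - k)\theta_k^2 \right| \le \frac{B^2}{n - k + 1}.$$

Also

$$|E_1(E_1^2[\Theta_k \mid \mathbf{X}_k]) - E_2(E_2^2[\Theta_{k+1} \mid \mathbf{X}_{k+1}^k])| = \left| \int \left\{ \frac{\left( \sum_{j=k}^{n} \theta_j \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i) \right)^2}{(n - k + 1) \sum_{j=k}^{n} \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i)} - \frac{\left( \sum_{j=k+1}^{n} \theta_j \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i) \right)^2}{(n - k) \sum_{j=k+1}^{n} \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i)} \right\} \mu^k(d\mathbf{x}_k) \right|.$$

For fixed $\mathbf{x}_k$ the expression inside the braces may be written as

$$\left( \frac{a_n + a}{b_n + b} \right)^2 \frac{b_n + b}{n - k + 1} - \left( \frac{a_n}{b_n} \right)^2 \frac{b_n}{n - k}$$

where

$$a_n = \sum_{j=k+1}^{n} \theta_j \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i), \qquad a = \theta_k \prod_{i=1}^{k} p_{\theta_i}(x_i),$$

$$b_n = \sum_{j=k+1}^{n} \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i), \qquad b = \prod_{i=1}^{k} p_{\theta_i}(x_i),$$

and

$$\left( \frac{a_n + a}{b_n + b} \right)^2 \frac{b_n + b}{n - k + 1} - \left( \frac{a_n}{b_n} \right)^2 \frac{b_n}{n - k} = \frac{(n - k)\, b_n (a_n + a)^2 - (n - k + 1)(b_n + b)\, a_n^2}{(n - k)(n - k + 1)(b_n + b)\, b_n}$$

$$= \frac{(n - k)(2 b_n a_n a + b_n a^2 - b\, a_n^2) - b_n a_n^2 - b\, a_n^2}{(n - k)(n - k + 1)(b_n + b)\, b_n}$$

$$= \frac{1}{n - k + 1} \left[ \frac{2 a_n a}{b_n + b} + \frac{a^2}{b_n + b} - \frac{b\, a_n^2}{(b_n + b)\, b_n} \right] - \frac{1}{(n - k)(n - k + 1)} \left[ \frac{a_n^2}{b_n + b} + \frac{b\, a_n^2}{(b_n + b)\, b_n} \right].$$

From the definition of $a_n$, $b_n$, $a$, and $b$ it is clear that
$|a_n / b_n| \le B$, $|a / b| \le B$, $b_n \ge 0$, $b \ge 0$, so that

$$\left| \frac{2 a_n a}{b_n + b} \right| \le \left| \frac{2 a_n a}{b_n} \right| = 2 \left| \frac{a_n}{b_n} \right| \left| \frac{a}{b} \right| b \le 2 B^2 b,$$

$$\frac{a^2}{b_n + b} \le \frac{a^2}{b} \le B^2 b, \qquad \frac{b\, a_n^2}{(b_n + b)\, b_n} \le \frac{b\, a_n^2}{b_n^2} \le B^2 b, \qquad \frac{a_n^2}{b_n + b} \le \frac{a_n^2}{b_n} \le B^2 b_n.$$

Thus we have

$$|E_1(E_1^2[\Theta_k \mid \mathbf{X}_k]) - E_2(E_2^2[\Theta_{k+1} \mid \mathbf{X}_{k+1}^k])| \le \int \left\{ \frac{4 B^2 b}{n - k + 1} + \frac{B^2 (b_n + b)}{(n - k)(n - k + 1)} \right\} \mu^k(d\mathbf{x}_k)$$

$$= \frac{4 B^2}{n - k + 1} \int \prod_{i=1}^{k} p_{\theta_i}(x_i)\, \mu^k(d\mathbf{x}_k) + \frac{B^2}{(n - k)(n - k + 1)} \int \prod_{i=1}^{k} p_{\theta_i}(x_i)\, \mu^k(d\mathbf{x}_k) + \frac{B^2}{n - k + 1} \int \frac{\sum_{j=k+1}^{n} \prod_{i=1}^{k} p_{\theta_{j-k+i}}(x_i)}{n - k}\, \mu^k(d\mathbf{x}_k)$$

$$= \frac{4 B^2}{n - k + 1} + \frac{B^2}{(n - k)(n - k + 1)} + \frac{B^2}{n - k + 1}.$$

Hence $|h(k, n, \boldsymbol{\theta})| \le \dfrac{7 B^2}{n - k + 1}$.

Q.E.D.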
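The algebraic step in the proof of Theorem 1, which expands a difference of two ratios in the quantities $a_n$, $a$, $b_n$, $b$, is easy to get wrong by hand. The following sketch checks the expansion and the resulting pointwise bound in exact rational arithmetic. The function names are hypothetical, and `m` stands for $n - k$.

```python
from fractions import Fraction

def brace_expression(a_n, a, b_n, b, m):
    """((a_n+a)/(b_n+b))^2 (b_n+b)/(m+1) - (a_n/b_n)^2 b_n/m, with m = n - k."""
    s = b_n + b
    return ((a_n + a) / s) ** 2 * s / (m + 1) - (a_n / b_n) ** 2 * b_n / m

def expanded_expression(a_n, a, b_n, b, m):
    """The same quantity after the expansion used in the proof."""
    s = b_n + b
    first = (2 * a_n * a / s + a ** 2 / s - b * a_n ** 2 / (s * b_n)) / (m + 1)
    second = (a_n ** 2 / s + b * a_n ** 2 / (s * b_n)) / (m * (m + 1))
    return first - second

def pointwise_bound(b_n, b, m, B):
    """4 B^2 b/(m+1) + B^2 (b_n+b)/(m(m+1)), valid when |a_n| <= B b_n, |a| <= B b."""
    return 4 * B ** 2 * b / (m + 1) + B ** 2 * (b_n + b) / (m * (m + 1))
```

Passing `Fraction` arguments keeps the comparison exact, so equality of the two expressions can be asserted without a tolerance.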

The implications of Theorem 1) were discussed in Chapter I. We now
state and prove a generalized form of a lemma of Samuel [11].

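Lemma 1) $\forall\, \boldsymbol{\theta}$, $k \ge 1$, $n \ge k$

$$\frac{1}{n - k + 1} \sum_{i=k}^{n} E[(\psi_i^k(\mathbf{X}_i^k) - \theta_i)^2] \le R_k(\boldsymbol{\theta}_n).$$

Proof: Fix $\boldsymbol{\theta}$, $k$, and $n$. Using the expressions $F_j(\psi_i^k, \theta_j)$ and
$R(\psi_i^k, \boldsymbol{\theta}_i)$ given in definition 8), and observing that $R(\psi_{i+1}^k, \boldsymbol{\theta}_i) \ge R(\psi_i^k, \boldsymbol{\theta}_i)$,
we have:

$$\frac{1}{n - k + 1} \sum_{i=k}^{n} E[(\psi_i^k(\mathbf{X}_i^k) - \theta_i)^2] = \frac{1}{n - k + 1} \sum_{i=k}^{n} F_i(\psi_i^k, \theta_i)$$

$$= \frac{1}{n - k + 1} \sum_{i=k}^{n-1} (i - k + 1)\, [R(\psi_i^k, \boldsymbol{\theta}_i) - R(\psi_{i+1}^k, \boldsymbol{\theta}_i)] + R(\psi_n^k, \boldsymbol{\theta}_n)$$

$$\le R(\psi_n^k, \boldsymbol{\theta}_n) = R_k(\boldsymbol{\theta}_n).$$

Q.E.D.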
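Def. 9) A decision procedure $\boldsymbol{\varphi}^k = (\varphi_1^k(\mathbf{X}_1), \varphi_2^k(\mathbf{X}_2), \ldots,
\varphi_i^k(\mathbf{X}_i), \ldots)$ is asymptotically optimal of $k$th order if

$$\forall\, \boldsymbol{\theta} \quad \lim_{n \to \infty} \left[ \frac{1}{n} \sum_{i=1}^{n} E[(\varphi_i^k - \theta_i)^2] - R_k(\boldsymbol{\theta}_n) \right] \le 0.$$

If

$$\lim_{n \to \infty} \left\{ \sup_{\boldsymbol{\theta}} \left[ \frac{1}{n} \sum_{i=1}^{n} E[(\varphi_i^k - \theta_i)^2] - R_k(\boldsymbol{\theta}_n) \right] \right\} \le 0$$

then $\boldsymbol{\varphi}^k$ is uniformly asymptotically optimal of $k$th order.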


We  shall  now  state  and  prove  a  general  theorem,  which  with  its  corollary 
will  enable  us  to  obtain  the  results  of  Chapter  III. 

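Theorem 2) $\forall$ bounded interval $\Omega = [\alpha, \beta]$, family of densities $\mathcal{P}$, and
integer $k \ge 1$, let $\boldsymbol{\varphi}^k$ be a decision procedure such that $\forall\, i \ge k$,
$P[\varphi_i(\mathbf{X}_i) \in \Omega] = 1$. Suppose there exist non negative functions $\epsilon_i(\boldsymbol{\theta}, \mathbf{x}_k)$,
$\delta_i(\boldsymbol{\theta}, \mathbf{x}_k)$, and $a_i(\boldsymbol{\theta})$ such that $\forall\, \boldsymbol{\theta}$, $\mathbf{x}_k$, $i \ge k$

a) $P[\,|\varphi_i(\mathbf{X}_i) - \psi_i(\mathbf{x}_k)| \ge \epsilon_i(\boldsymbol{\theta}, \mathbf{x}_k) \mid \mathbf{X}_i^k = \mathbf{x}_k] \le \delta_i(\boldsymbol{\theta}, \mathbf{x}_k)$

b) $\lim_{n \to \infty} \left\{ \dfrac{1}{n} \sum_{i=k}^{n} E[\epsilon_i(\boldsymbol{\theta}, \mathbf{X}_i^k) + \delta_i(\boldsymbol{\theta}, \mathbf{X}_i^k) \mid Q_i(\mathbf{X}_i^k) \ge a_i(\boldsymbol{\theta})] + \dfrac{1}{n} \sum_{i=k}^{n} P[Q_i < a_i] \right\} = 0$
uniformly in $\boldsymbol{\theta}$

where the functions $Q_i$ and $\psi_i$ are as given by definitions 6) and 7).

Then the decision procedure $\boldsymbol{\varphi}^k$ is uniformly asymptotically optimal
of $k$th order, and moreover

$$\mathcal{E}_n^k(\boldsymbol{\varphi}^k, \boldsymbol{\theta}) \le \frac{(k - 1)(\beta - \alpha)^2}{n} + \frac{2(\beta - \alpha)}{n} \sum_{i=k}^{n} E[\epsilon_i + (\beta - \alpha)\delta_i \mid Q_i \ge a_i] + \frac{2(\beta - \alpha)^2}{n} \sum_{i=k}^{n} P[Q_i < a_i]$$

where $\mathcal{E}_n^k(\boldsymbol{\varphi}^k, \boldsymbol{\theta})$ is given by definition 4).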

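Proof: Clearly it is sufficient to prove the upper bound for $\mathcal{E}_n^k(\boldsymbol{\varphi}^k, \boldsymbol{\theta})$
is correct.

We first represent $\mathcal{E}_n^k$ as a sum of several functions, and then examine
each of these functions. We shall consider $k$ and $n$ fixed.
Let:

$$H_1(\boldsymbol{\varphi}^k, \boldsymbol{\theta}) = \frac{1}{n} \sum_{i=1}^{k-1} \{E[(\varphi_i^k - \theta_i)^2] - R_k(\boldsymbol{\theta}_n)\}$$

$$H_2(\boldsymbol{\varphi}^k, \boldsymbol{\theta}) = \frac{1}{n} \sum_{i=k}^{n} \{E[(\varphi_i^k - \theta_i)^2] - E[(\psi_i^k - \theta_i)^2]\}$$

$$H_3(\boldsymbol{\varphi}^k, \boldsymbol{\theta}) = \frac{1}{n} \sum_{i=k}^{n} \{E[(\psi_i^k - \theta_i)^2] - R_k(\boldsymbol{\theta}_n)\}$$

For the remainder of the proof we delete the superscript $k$. Clearly

$$\mathcal{E}_n(\boldsymbol{\varphi}, \boldsymbol{\theta}) = \frac{1}{n} \sum_{i=1}^{n} E[(\varphi_i - \theta_i)^2] - R_k(\boldsymbol{\theta}_n) = H_1(\boldsymbol{\varphi}, \boldsymbol{\theta}) + H_2(\boldsymbol{\varphi}, \boldsymbol{\theta}) + H_3(\boldsymbol{\varphi}, \boldsymbol{\theta}).$$

Let $B = (\beta - \alpha)$. Then $E[(\varphi_i - \theta_i)^2] \le B^2$ and $R_k(\boldsymbol{\theta}_n) \le B^2$; hence
$H_1(\boldsymbol{\varphi}, \boldsymbol{\theta}) \le \dfrac{(k - 1) B^2}{n}$. It follows immediately from Lemma 1) that
$H_3(\boldsymbol{\varphi}, \boldsymbol{\theta}) \le 0$. It remains only to examine $H_2(\boldsymbol{\varphi}, \boldsymbol{\theta})$.

$$H_2(\boldsymbol{\varphi}, \boldsymbol{\theta}) = \frac{1}{n} \sum_{i=k}^{n} E[(\varphi_i - \psi_i)(\varphi_i + \psi_i - 2\theta_i)]$$

$$\le \frac{2B}{n} \sum_{i=k}^{n} E[|\varphi_i - \psi_i|]$$

$$\le \frac{2B}{n} \sum_{i=k}^{n} \min[E[\epsilon_i + B\delta_i],\ B]$$

$$\le \frac{2B}{n} \sum_{i=k}^{n} E[\epsilon_i + B\delta_i \mid Q_i \ge a_i] + \frac{2B^2}{n} \sum_{i=k}^{n} P[Q_i < a_i].$$

The desired result follows immediately.

Q.E.D.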
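Corollary. If condition b) of Theorem 2) is replaced by

b') $\lim_{i \to \infty} E[\epsilon_i(\boldsymbol{\theta}, \mathbf{X}_i^k) + \delta_i(\boldsymbol{\theta}, \mathbf{X}_i^k)] = 0$ uniformly in $\boldsymbol{\theta}$

then $\boldsymbol{\varphi}^k$ is uniformly asymptotically optimal in the $k$th order.

Proof: From the proof of Theorem 2) it is enough to show
$\lim_{n \to \infty} \{\sup_{\boldsymbol{\theta}} H_2(\boldsymbol{\varphi}^k, \boldsymbol{\theta})\} \le 0$, recalling that $H_2(\boldsymbol{\varphi}, \boldsymbol{\theta})$ is a function of $n$.
But

$$H_2(\boldsymbol{\varphi}, \boldsymbol{\theta}) \le \frac{2B}{n} \sum_{i=k}^{n} E[|\varphi_i - \psi_i|] \le \frac{2B}{n} \sum_{i=k}^{n} E\{\epsilon_i(\boldsymbol{\theta}, \mathbf{X}_i^k) + B\delta_i(\boldsymbol{\theta}, \mathbf{X}_i^k)\}$$

$\to 0$ uniformly in $\boldsymbol{\theta}$ as $n \to \infty$ since uniform convergence
implies uniform convergence in Cesaro mean.

Q.E.D.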
We turn now to several lemmas which will be useful in Chapter III.
Lemmas 2), 3), and 4) will be used to establish condition a) of Theorem 2),
while Lemma 5) will be used in evaluating certain limits.

The first of these is an inequality proved by Hoeffding [2], which
we state here without proof.

Lemma 2) If $X_1, X_2, \ldots, X_n$ are independent and $a \le X_i \le b$
for $i = 1, \ldots, n$ then $\forall\, t > 0$

$$P[\,|\bar{X} - E[\bar{X}]| \ge t\,] \le 2 \exp\left( -\frac{2 n t^2}{(b - a)^2} \right)$$

where $\bar{X} = \dfrac{1}{n} \sum_{i=1}^{n} X_i$.
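Hoeffding's inequality bounds the two-sided tail of a sample mean of bounded independent variables by $2\exp(-2nt^2/(b-a)^2)$. As a sanity check, the sketch below evaluates that bound and compares it with an exact binomial tail for fair coin tosses; both function names are illustrative.

```python
import math

def hoeffding_bound(n, t, a, b):
    """Upper bound 2 exp(-2 n t^2 / (b - a)^2) on P[|mean - E mean| >= t],
    capped at 1 since it bounds a probability."""
    return min(1.0, 2.0 * math.exp(-2.0 * n * t * t / (b - a) ** 2))

def exact_fair_coin_tail(n, d):
    """Exact P[|S - n/2| >= d] for S ~ Binomial(n, 1/2), d a non-negative integer.
    Dividing the event by n gives the event |mean - 1/2| >= d / n."""
    favourable = sum(math.comb(n, s) for s in range(n + 1)
                     if abs(2 * s - n) >= 2 * d)
    return favourable / 2 ** n
```

For example, `exact_fair_coin_tail(100, 10)` is the exact probability that the mean of 100 fair coin tosses deviates from 1/2 by at least 0.1, and it can be compared against `hoeffding_bound(100, 0.1, 0.0, 1.0)`.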
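Lemma 3) Let $X_1, X_2, \ldots, X_n$, $0 \le X_i \le b$, $i = 1, \ldots, n$, be
a sequence of random variables such that for some $k > 0$ and
$\forall\, i = 1, 2, \ldots, k$ the random variables $X_i, X_{i+k}, X_{i+2k}, \ldots$ are
mutually independent. Then

$$P[\,|\bar{X} - E[\bar{X}]| \ge \delta\,] \le 2k \exp\left( -\frac{2 n^2 \delta^2}{k (n + k)\, b^2} \right).$$

Proof: Let $S_i = \sum_{j=0}^{m} X_{jk+i}$ where $m$ is defined as the integer such
that $\dfrac{n}{k} - 1 \le m < \dfrac{n}{k}$ and $X_\ell = 0$ for $\ell > n$. Let $\gamma_i = E[S_i]$. Let
$A_i$ = event $|S_i - \gamma_i| \ge \dfrac{\delta n}{k}$. But $S_i$ is the sum of $m + 1$ independent
random variables and from Lemma 2) we have:

$$P[A_i] = P\left[ \left| \sum_{j=0}^{m} X_{jk+i} - \gamma_i \right| \ge \frac{\delta n}{k} \right] = P\left[ \left| \frac{1}{m + 1} \sum_{j=0}^{m} X_{jk+i} - \frac{\gamma_i}{m + 1} \right| \ge \frac{\delta n}{k (m + 1)} \right]$$

$$\le 2 \exp\left( -2 (m + 1) \frac{\delta^2 n^2}{k^2 (m + 1)^2 b^2} \right) = 2 \exp\left( -\frac{2 n^2 \delta^2}{k^2 (m + 1)\, b^2} \right) \le 2 \exp\left( -\frac{2 n^2 \delta^2}{k (n + k)\, b^2} \right).$$

So that

$$P[\,|\bar{X} - E[\bar{X}]| \ge \delta\,] = P\left[ \left| \sum_{i=1}^{k} (S_i - \gamma_i) \right| \ge \delta n \right] \le P\left[ \sum_{i=1}^{k} |S_i - \gamma_i| \ge \delta n \right]$$

$$= 1 - P\left[ \sum_{i=1}^{k} |S_i - \gamma_i| < \delta n \right] \le 1 - P\left[ \bigcap_{i=1}^{k} A_i^c \right] = P\left[ \bigcup_{i=1}^{k} A_i \right] \le \sum_{i=1}^{k} P[A_i]$$

$$\le 2k \exp\left( -\frac{2 n^2 \delta^2}{k (n + k)\, b^2} \right).$$

Q.E.D.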
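The proof of Lemma 3 splits the sequence into $k$ interleaved subsequences, each of which consists of independent terms, and applies Hoeffding's inequality to each. A minimal sketch of the split and of the resulting bound $2k\exp(-2n^2\delta^2/(k(n+k)b^2))$ follows; the function names are assumptions made for illustration.

```python
import math

def split_interleaved(xs, k):
    """Split x_1..x_n into the k subsequences (x_i, x_{i+k}, x_{i+2k}, ...),
    i = 1..k, used in the proof (0-based slicing of a 1-based sequence)."""
    return [xs[i::k] for i in range(k)]

def k_dependent_bound(n, k, delta, b):
    """Bound 2k exp(-2 n^2 delta^2 / (k (n + k) b^2)) on
    P[|mean - E mean| >= delta] for variables in [0, b], capped at 1."""
    exponent = -2.0 * n * n * delta * delta / (k * (n + k) * b * b)
    return min(1.0, 2.0 * k * math.exp(exponent))
```

With `k = 1` the bound is slightly weaker than Hoeffding's inequality itself, because $n^2/(n+1) < n$; the price of dependence shows up both in the leading factor $2k$ and in the exponent.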


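Lemma 4) Suppose for non negative random variables $X_1$, $X_2$,
$P[\,|X_i - \mu_i| \ge \delta_i\,] \le \epsilon_i$, $i = 1, 2$; $\mu_1 \ge 0$, $\mu_2 > 0$; and $B$ is any
number such that $\dfrac{\mu_1}{\mu_2} \le B < \infty$. Define $W = \min\left[\dfrac{X_1}{X_2}, B\right]$ (if $X_1 = X_2 = 0$
we take $W = 0$). Then

$$P\left[ \left| W - \frac{\mu_1}{\mu_2} \right| \ge \frac{1}{\mu_2} (\delta_1 + B\delta_2) \right] \le \epsilon_1 + \epsilon_2.$$

Proof: We first show $\left| W - \dfrac{\mu_1}{\mu_2} \right| \le \dfrac{1}{\mu_2} (|X_1 - \mu_1| + B|X_2 - \mu_2|)$.

Case i) $W < B$. If $X_2 = 0$ then $X_1 = 0$ and

$$\left| W - \frac{\mu_1}{\mu_2} \right| = \frac{\mu_1}{\mu_2} \le \frac{1}{\mu_2} (|X_1 - \mu_1| + B|X_2 - \mu_2|).$$

If $X_2 \ne 0$ then

$$\left| W - \frac{\mu_1}{\mu_2} \right| = \left| \frac{X_1}{X_2} - \frac{\mu_1}{\mu_2} \right| = \frac{1}{X_2 \mu_2} \left| (X_1 - \mu_1) X_2 - (X_2 - \mu_2) X_1 \right| \le \frac{1}{\mu_2} (|X_1 - \mu_1| + B|X_2 - \mu_2|).$$

Case ii) $W = B$. Let $Y = \dfrac{X_1}{B}$. Then $Y \ge 0$ and $Y \ge X_2$.

$$0 \le B - \frac{\mu_1}{\mu_2} = \frac{X_1}{Y} - \frac{\mu_1}{\mu_2} = \frac{1}{\mu_2} [(X_1 - \mu_1) + B(\mu_2 - Y)]$$

$$\le \frac{1}{\mu_2} (|X_1 - \mu_1| + B(\mu_2 - X_2)) \le \frac{1}{\mu_2} (|X_1 - \mu_1| + B|X_2 - \mu_2|).$$

From this fact we have

$$|X_1 - \mu_1| < \delta_1 \text{ and } |X_2 - \mu_2| < \delta_2 \Rightarrow \left| W - \frac{\mu_1}{\mu_2} \right| < \frac{1}{\mu_2} (\delta_1 + B\delta_2)$$

and

$$P\left[ \left| W - \frac{\mu_1}{\mu_2} \right| \ge \frac{1}{\mu_2} (\delta_1 + B\delta_2) \right] = 1 - P\left[ \left| W - \frac{\mu_1}{\mu_2} \right| < \frac{1}{\mu_2} (\delta_1 + B\delta_2) \right]$$

$$\le 1 - P[\,|X_1 - \mu_1| < \delta_1 \text{ and } |X_2 - \mu_2| < \delta_2\,] = P[\,|X_1 - \mu_1| \ge \delta_1 \text{ or } |X_2 - \mu_2| \ge \delta_2\,] \le \epsilon_1 + \epsilon_2.$$

Q.E.D.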

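Lemma 5) Let $F$ be an absolutely continuous distribution
function with corresponding density function $f$. Let
$C_M = \{x : |f''(x)| \le M \text{ and } f''(x) \text{ is continuous}\}$. Let
$D = \{x : |f'(x)| > 0\}$. Let $y \in (-\infty, \infty)$.

i) If there exist $M < \infty$, $\epsilon_0 > 0$ such that: $y \in D$ and
$\{x : y \le x \le y + \epsilon_0\} \subset C_M$, then $\forall\, \epsilon$, $0 < \epsilon \le \epsilon_0$,

$$\frac{1}{\epsilon} \int_y^{y+\epsilon} f(x)\,dx = f(y + \rho\epsilon) \quad \text{with} \quad \left| \rho - \tfrac{1}{2} \right| \le \frac{2 M \epsilon}{3 |f'(y)|}.$$

ii) If there exist $M < \infty$, $\epsilon_0 > 0$ such that: $y \in D$ and
$\{x : y - \epsilon_0 \le x \le y\} \subset C_M$, then $\forall\, \epsilon$, $0 < \epsilon \le \epsilon_0$,

$$\frac{1}{\epsilon} \int_{y-\epsilon}^{y} f(x)\,dx = f(y - \rho\epsilon) \quad \text{with} \quad \left| \rho - \tfrac{1}{2} \right| \le \frac{2 M \epsilon}{3 |f'(y)|}.$$

Proof: Using the mean value theorem and the Taylor expansion we have
in case i)

$$\frac{1}{\epsilon} \int_y^{y+\epsilon} f(x)\,dx = f(y + \rho\epsilon) = f(y) + f'(y)\rho\epsilon + \frac{f''(x^*)}{2} (\rho\epsilon)^2$$

where $0 \le \rho \le 1$ and $y \le x^* \le y + \epsilon$.

But:

$$\frac{1}{\epsilon} \int_y^{y+\epsilon} f(x)\,dx = \frac{1}{\epsilon} (F(y + \epsilon) - F(y)) = \frac{1}{\epsilon} \left[ f(y)\epsilon + \frac{f'(y)}{2} \epsilon^2 + \frac{f''(x')}{6} \epsilon^3 \right]$$

where $y \le x' \le y + \epsilon$.

Thus

$$f'(y)\rho\epsilon + \frac{f''(x^*)}{2} (\rho\epsilon)^2 = \frac{f'(y)}{2} \epsilon + \frac{f''(x')}{6} \epsilon^2,$$

$$\left( \rho - \tfrac{1}{2} \right) f'(y) = \left[ \frac{f''(x')}{6} - \frac{f''(x^*)}{2} \rho^2 \right] \epsilon,$$

$$\left| \rho - \tfrac{1}{2} \right| |f'(y)| \le \left[ \frac{|f''(x')|}{6} + \frac{|f''(x^*)|}{2} \rho^2 \right] \epsilon,$$

$$\left| \rho - \tfrac{1}{2} \right| \le \frac{1}{|f'(y)|} \left( \frac{M}{6} + \frac{M}{2} \right) \epsilon = \frac{2 M \epsilon}{3 |f'(y)|}.$$

The proof in case ii) is similar.

Q.E.D.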
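Lemma 5 can be checked in closed form for a simple density. The sketch below assumes the exponential density $f(x) = e^{-x}$ on $x \ge 0$ (an illustrative choice, not one made in the paper), for which the averaging point $\rho$ has an explicit formula and does not depend on $y$. On $[y, y + \epsilon_0]$ one may take $M = e^{-y}$, and $|f'(y)| = e^{-y}$, so the bound $2M\epsilon/(3|f'(y)|)$ reduces to $2\epsilon/3$.

```python
import math

def rho_exponential(eps):
    """Solve (1/eps) * integral_y^{y+eps} exp(-x) dx = exp(-(y + rho*eps))
    for rho. The factor exp(-y) cancels, leaving
    rho = -log((1 - exp(-eps)) / eps) / eps."""
    average = -math.expm1(-eps) / eps  # (1 - exp(-eps)) / eps, computed stably
    return -math.log(average) / eps

def lemma5_bound_exponential(eps):
    """Bound 2 M eps / (3 |f'(y)|) with M = |f'(y)| = exp(-y), i.e. 2 eps / 3."""
    return 2.0 * eps / 3.0
```

For small `eps` the value of `rho_exponential(eps)` is close to 1/2, as the lemma predicts, and the deviation shrinks roughly linearly in `eps`.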
III.   Main results.

We  now  turn  to  the  task  of  finding  asymptotically  optimal  procedures, 
bounds,  and  rates  of  convergence  for  specific  classes  of  distributions. 
We  shall  also  look  at  a  modification  of  our  problem  in  a  very  general 
class  of  distributions.   Finally  we  shall  consider  the  "empirical  Bayes" 
problem. 

The  notation  we  shall  develop  and  use  is  inherently  cumbersome;  to 
ease  its  burden  somewhat  we  shall  not  always  indicate  all  possible 
dependencies  and  shall  not  always  indicate  one  or  more  of  the  arguments 
of a function. Hopefully no misunderstanding will arise because of this
practice.

A.    A  special  discrete  class  of  distributions. 

We  first  consider  a  special  discrete  class  of  distributions  defined 
on  the  non  negative  integers  as  follows: 

$$P[X = x \mid \theta] = p_\theta(x) = \theta^x h(\theta)\, g(x), \qquad x = 0, 1, 2, \ldots$$

where:
i) $\theta \in \Omega = [0, \beta]$, $0 < \beta < \infty$, $h(\beta) > 0$;

ii) [condition illegible in the source];

iii) There exists $M^* < \infty$ such that $\forall\, i, j = 0, 1, 2, \ldots$
such that $g(i)\, g(j) \ne 0$, [inequality illegible in the source, with bound $M^*$];

iv) If $\beta < 1$ then there exists a constant $b$ such that
$g(x) \le x^b$ for all but finitely many integers $x$.
If $\beta \ge 1$ then there exists a constant $b$ such that
$g(x) \le$ [bound illegible in the source] for all but finitely many integers $x$.


All of these restrictions are quite mild. The third prevents $g$ from
oscillating wildly as its argument progresses through the integers. The
second and fourth conditions restrict slightly the rate at which
$\theta^x g(x) \to 0$ as $x \to \infty$.

Examples of such a class are:

Type                 $\Omega$                     $h(\theta)$                         $g(x)$

Poisson              $[0, \beta]$, $\beta < \infty$     $e^{-\theta}$                       $1/x!$

geometric            $[0, \beta]$, $\beta < 1$          $(1 - \theta)$                      $1$

negative binomial    $[0, \beta]$, $\beta < 1$          $(1 - \theta)^a$, $a > 0$,          $\binom{a + x - 1}{x}$
                                                        $a$ assumed known

The conditions are easily seen to be satisfied.
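As a numerical illustration of the three example families, the following minimal sketch evaluates $\theta^x h(\theta) g(x)$ and checks that each family sums to one. The function names, the value `A = 2.5` for the known parameter $a$, and the truncation at 200 terms are assumptions made for the illustration and do not come from the text.

```python
import math

A = 2.5  # hypothetical value of the known negative binomial parameter a > 0


def pmf(x, theta, h, g):
    """p_theta(x) = theta**x * h(theta) * g(x) on the non-negative integers."""
    return theta ** x * h(theta) * g(x)


# (h, g) pairs for the three example families; log-gamma avoids overflow.
FAMILIES = {
    "poisson": (
        lambda t: math.exp(-t),
        lambda x: math.exp(-math.lgamma(x + 1)),            # 1 / x!
    ),
    "geometric": (
        lambda t: 1.0 - t,
        lambda x: 1.0,
    ),
    "negative_binomial": (
        lambda t: (1.0 - t) ** A,
        # generalized binomial coefficient C(a + x - 1, x)
        lambda x: math.exp(
            math.lgamma(A + x) - math.lgamma(A) - math.lgamma(x + 1)
        ),
    ),
}


def total_mass(name, theta, terms=200):
    """Truncated sum of the probabilities; should be close to 1."""
    h, g = FAMILIES[name]
    return sum(pmf(x, theta, h, g) for x in range(terms))


if __name__ == "__main__":
    print(total_mass("poisson", 3.0))
    print(total_mass("geometric", 0.6))
    print(total_mass("negative_binomial", 0.6))
```

The same `pmf` helper covers all three families because only the pair $(h, g)$ changes.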

V 

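Recalling that $\underline{X}_j^k = (X_{j-k+1}, \ldots, X_j)$, $j = k, k+1, \ldots$, we define:

$$Y_j(\underline{x}_k) = \begin{cases} 1 & \text{if } \underline{X}_j^k = \underline{x}_k \\ 0 & \text{otherwise} \end{cases} \qquad j = k, k+1, \ldots$$

$$P_i(\underline{x}_k) = \frac{1}{i - k + 1} \sum_{j=k}^{i} Y_j(\underline{x}_k) \qquad i = k, k+1, \ldots$$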
$$P_i^*(\underline{x}_k) = \frac{g(x_k)\, P_i(x_1, x_2, \ldots, x_{k-1}, x_k + 1)}{g(x_k + 1)\, P_i(\underline{x}_k)} \qquad \text{if } g(x_k + 1)\, P_i(\underline{x}_k) > 0$$


otherwise 


i  =  k,  k  +  1, 


[1 
2 


if  i  =  1,  . . .  ,  k  -  1 

^(Xj)  !=  {  PJ(2$)  if  °  1^(2^)  <P  i  -  kf  k  +  l,  ...  , 

P      if  P  <  P*(xJ)      i  =  k,  k  +  1,  ...  , 
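For the first order case $k = 1$ the definitions above reduce to a ratio of relative frequencies truncated to $[0, \beta]$. The following is a minimal sketch under that reading; the function names are hypothetical, and the value returned when $g(x+1) P_i(x) = 0$ is an assumption, since that case is not legible in the definition.

```python
from collections import Counter
import math


def poisson_g(x):
    """g(x) = 1 / x! for the Poisson family."""
    return 1.0 / math.factorial(x)


def estimate_current_theta(observations, g, beta):
    """k = 1 estimate of the parameter behind the latest observation.

    Uses P_i(x), the relative frequency of x among the first i observations
    (the current one included), and the ratio
        g(x) P_i(x + 1) / (g(x + 1) P_i(x)),
    truncated to the interval [0, beta].
    """
    i = len(observations)
    x = observations[-1]
    counts = Counter(observations)
    p_x = counts[x] / i           # P_i(x)
    p_next = counts[x + 1] / i    # P_i(x + 1)
    denom = g(x + 1) * p_x
    if denom > 0:
        ratio = g(x) * p_next / denom
    else:
        ratio = 0.0  # assumption: fallback value chosen for this sketch only
    return min(max(ratio, 0.0), beta)


if __name__ == "__main__":
    print(estimate_current_theta([0, 1, 1, 2, 1], poisson_g, beta=3.0))
```

For the Poisson family $g(x)/g(x+1) = x + 1$, so the ratio is $(x+1) P_i(x+1) / P_i(x)$.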


Theorem  3)   For  the  problem  defined  in  this  section 

k   /  k   k      v  "t"  h 

JE  m   [ty-if   Vo>    •••  )   is  uniformly  asymptotically  optimal  cf  k 

order  for   k  m  1,   2,    •  ••  ,  and  fi£(^k,e)  <  B(k,  n)  =  OJ10^^  . 

\   n  '  / 

Proof:   We  shall  show  the  conditions  of  Theorem  2)  are  satisfied. 

Fix  k.  Fix  6. 


Recall

$$Q_i(\underline{x}_k) = \frac{1}{i - k + 1} \sum_{j=k}^{i} \prod_{l=1}^{k} p_{\theta_{j-k+l}}(x_l)$$


We observe that there exists a set $R_i$ in $k$-dimensional space such

that     PfxfeR.]   =  1     and     £J£R±  =>  S^^iC^)    >  0#      We   then  have 
Xjc€R1  =>  g(^  +  l)Q  (x.)    >  0.      VxkeR1      we  know  from  definition    7) 
that     y  i  >  k: 
t?(*v)  ■■*±-*-*±


?  ej  i  ei+i"<ej-k+i)8(^) 


A 1  ^-w^1 

g(xk)Q1(x1,  xg,  ...  ,  y1,  ^  +  1) 


We  now  consider  the  Y  '  s.   Since  the  X  '  s  j  *  1,  . ...  ,    i  are 

J  J 

independent  it  is  clear  that  Vx,  )  i'=  1,  2,  ..=  ,   k,  the  random  variables 
Y,(x  )^  Y»   '.x,  )>  Y„    ;x,)>  ...   are  mutually  conditionally  independent 
given  X,  =  x,  .   Also: 


*[*j («,,)  |xj  -^l-p^-  a^-a*] 


I  Pe     (x.)     J  =  k,  ...  ,  i  -  k 
=1  0J-k+i  x 


0  <  EtY^x^)  |x*  =  x^]  <  1     J  =  i  -  k  +  1,  . . .  ,  i 
kg(xk  +  1) 


Now  Vi^eR.  V  ^(x^  gj.)  >  i  .  k  +  i   i  "  2k#  *" 
we  have,  using  Lemma  5) : 
P[|g(Xj.  +  l)*!^)


<  P 


«!*(%)  I    +   U-2U1        jk+1  W 


i 

z 


(xj 


j=i-k+l  i=l       j-k+i 


5  (i  -  k  +  l) 


> 


-  g^  +  l)(i  -  2k  +  1) 


iyl^-5 


^k 


<  p 


5.(1  -  k  +  l) 
^ktefc*   "   Qi-k(^k)l   •^g(xk1+  l)(i  -   2k+  I) 


"    (i   -    2k  +   1)15  -  *k 


<*-*{-*&:?;$' 


8.(i  -  k  +  1) 


k 


_g(x^  +  l)(i  -    2k  +  l)    "    (i   -   2k  +   1) 


<  2k  exp  <   —7 


U5.  26    (i   -   k  +  1) 

1  1 


(\  +  1}  k  g2(xk  +   1) 


kgC^) 

By  et  similar  argument     V^    V€.(x,  *    $•)    >  ~ ; — ttt     we  frat<ve: 

PflgC^P^x^    ...    ,    x^=i,   ^  +  1)   -   g(xk)Q.(x_v    ...    ,    xk_1,    xk  +  1)| 


>  e  .  I X 7  =  x.  ]   <  2k  exp 

-      i'-i       — k"   - 


ke 


2ef(i  -  k  +  l)  1 

1  1  L 


uxTl  .       2/    k, 

>v    k7  k  g  (x   ; 
We are now in a position to apply lemma 4) to obtain the two functions
required for condition a) of theorem 2). If we substitute in the lemma

W  =  cp\(X±) 

^2  =  g(xk  +  [lK^ 

Then  we  have  yi=  2k,  2k  +  1,  ... 


r  2c?(i-k+l)   4e. 
1   —  +   L 


<  2k 


kg  (3^) 


iT^T 


+  e 


2B*(i-k+l) 

kg2(xk+l) 


1+5 


+ 


rA\+i) 


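Recall the above inequality has been shown only for $\underline{x}_k \in R_i$, $i \ge 2k$,

$$\delta_i > \frac{k\, g(x_k + 1)}{i - k + 1}, \qquad \varepsilon_i > \frac{k\, g(x_k)}{i - k + 1}.$$

If any of these conditions should not hold we shall use the trivial inequality

$$P\left[\, |\varphi_i^k - \psi_i^k| > 0 \;\middle|\; \underline{X}_i^k = \underline{x}_k \right] \le 1$$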
e.  +M.


4e.      Al-S . 

£  .  =  2k  exp  -{— 7— ^r  +   /  \^\ 


2         *2 
e  .         5  . 

1 


2(i  -  k  +  l)     i     

k       2,   n    2 


g  (xk)    g  (xk  +  l) 


We have now produced the inequalities for condition a) of theorem 2).

We must now choose functions $\delta_i(\underline{x}_k, \underline{\theta}_i)$, $\varepsilon_i(\underline{x}_k, \underline{\theta}_i)$ subject to the
conditions that

$$\delta_i < \frac{k\, g(x_k + 1)}{i - k + 1} \;\Rightarrow\; \delta_i = 0, \qquad \varepsilon_i < \frac{k\, g(x_k)}{i - k + 1} \;\Rightarrow\; \varepsilon_i = 0,$$

and show that for $Q_i$ and some $a_i$ condition b) of theorem 2) is
satisfied.

We  first  prove  the  following  lemma. 

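Lemma 6)   For $\{p_\theta : \theta \in \Omega\}$ defined in this section, $\forall\, k = 1, 2, \ldots$,
$\forall\, \varepsilon$, $0 < \varepsilon < 1$, there exists a constant $M = M(\varepsilon, k) < \infty$ such that
$\forall\, \underline{\theta} \in \Omega^\infty$, $\forall\, n = k, k+1, \ldots$, $n \ne 1$:

$$\sum_{i=k}^{n} P\left[ \sum_{j=k}^{i} \prod_{l=1}^{k} p_{\theta_{j-k+l}}(X_{i-k+l}) < i^{\varepsilon} \right] \le M n^{\varepsilon} \log^{k} n$$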
Proof.   We first consider the case $k = 1$. Let $d$ be the smallest
integer such that $g(d) \ne 0$. $\forall\, m = d, d+1, \ldots$, $\forall\, \theta \in \Omega$:

$$\frac{\sum_{x=m}^{\infty} p_\theta(x)}{p_\theta(m)} = \sum_{y=d}^{\infty} \theta^{y} h(\theta) g(y) \left[ \frac{g(y + m)}{h(\theta) g(y) g(m)} \right] + \sum_{y=0}^{d-1} \theta^{y} \frac{g(y + m)}{g(m)} \le \frac{M''}{h(\beta)}$$

where $M'' < \infty$ from condition iii) and the
easily verified fact that $h$ is a decreasing function.


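Also it is clear that $\forall\, \underline{\theta}$, $\forall\, i$, there exists a smallest non-negative integer
$m_i$ such that $\sum_{j=1}^{i} p_{\theta_j}(m_i) < i^{\varepsilon}$.

Let

$$I(a, b) = \begin{cases} 1 & \text{if } a < b \\ 0 & \text{otherwise} \end{cases}$$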


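Then:

$$\sum_{i=1}^{n} P\left[ \sum_{j=1}^{i} p_{\theta_j}(X_i) < i^{\varepsilon} \right] = \sum_{i=1}^{n} \sum_{x=0}^{\infty} I\left( \sum_{j=1}^{i} p_{\theta_j}(x),\; i^{\varepsilon} \right) p_{\theta_i}(x)$$

$$\le \sum_{i=1}^{n} \sum_{x=m_i}^{\infty} p_{\theta_i}(x) \le \frac{M''}{h(\beta)} \sum_{i=1}^{n} p_{\theta_i}(m_i)$$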
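Now $\sum_{j=1}^{i} p_{\theta_j}(x) = \sum_{j=1}^{i} \theta_j^{x} h(\theta_j) g(x) \le i \beta^{x} h(0) g(x)$, so that if
$\beta^{x} h(0) g(x) < i^{\varepsilon - 1}$ then $m_i \le x$. But $\beta^{x} h(0) g(x) \ge i^{\varepsilon - 1}$ implies

$$x \log \beta + \log g(x) \ge (\varepsilon - 1) \log i - \log h(0)$$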


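If $\beta < 1$ then we have from condition iv) that

$$x \log \beta + b \log x \ge (\varepsilon - 1) \log i - \log h(0)$$

$$\Rightarrow\; x \le \frac{1 - \varepsilon}{\log (1/\beta)} \left[ \log i + \frac{\log h(0)}{1 - \varepsilon} + \frac{b \log x}{1 - \varepsilon} \right]$$

$\Rightarrow$ there exists $M''' < \infty$ such that $x < M''' \log i$ for $i > 1$.

If $\beta \ge 1$ then we have from condition iv) that

$$x \log \beta - (x - b) \log x \ge (\varepsilon - 1) \log i - \log h(0)$$

$$\Rightarrow\; x (\log x - \log \beta) \le (1 - \varepsilon) \log i + \log h(0) + b \log x$$

But $x > \beta^2 \Rightarrow \log x - \log \beta > \log \beta$,
so if $x > \beta^2 \ne 1$ then also

$$x < \frac{1 - \varepsilon}{\log \beta} \left[ \log i + \frac{\log h(0)}{1 - \varepsilon} + \frac{b \log x}{1 - \varepsilon} \right]$$

If $\beta = 1$ then if $x > 1$ also

$$x < \frac{1 - \varepsilon}{\log x} \left[ \log i + \frac{\log h(0)}{1 - \varepsilon} + \frac{b \log x}{1 - \varepsilon} \right]$$

In each case there exists $M''' < \infty$ such that $x < M''' \log i$
for $i > 1$.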
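Let $M_i$ be the smallest integer such that $M_i \ge M''' \log i$ for
$i > 1$. Then $\beta^{M_i} h(0) g(M_i) < i^{\varepsilon - 1}$ and hence $m_i \le M_i$. Now let
$I_v = \{ i : i = 1, 2, \ldots, n \text{ and } m_i = v \}$ for $v = 1, 2, \ldots, M_n$.
Let $i_v$ be the greatest integer in $I_v$. Then

$$\sum_{i=1}^{n} P\left[ \sum_{j=1}^{i} p_{\theta_j}(X_i) < i^{\varepsilon} \right] \le \frac{M''}{h(\beta)} \sum_{i=1}^{n} p_{\theta_i}(m_i) = \frac{M''}{h(\beta)} \sum_{v=1}^{M_n} \sum_{i \in I_v} p_{\theta_i}(v)$$

$$\le \frac{M''}{h(\beta)} \sum_{v=1}^{M_n} (i_v)^{\varepsilon} \qquad \text{from the definition of } m_i$$

$$\le \frac{M'' M_n}{h(\beta)}\, n^{\varepsilon} \le M n^{\varepsilon} \log n .$$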


This  completes  the  proof  for  the  case  k  =  1.   We  now  consider  the  case 
k  >  2. 


lp[E   ft 


i=k 


j=k  i^l       j-k+* 


(Xi-W  < 


.e 


m.-l 


|-   n         °°  °°  i 

Li=k  xr°       Vf°  ^=0 


I    11  p0       (xP>  i€ 

Lj=k    i=l      Dj-k+i 


P0  (x^) 

4=:      i-k+i 


n         oo 


00  00 


z    x  ••• 

i=k  x1=0  Vl=0  Vm. 


i       k 


i  n  pe 

Lj=k  i=l       j-k+i 


(xi),    i 


.  t 


5=]      °i-k+i     * 
where $m_i$ is as defined for the case $k = 1$. We shall now consider the
two bracketed parts of the right hand side of the last equation separately.
The second bracketed expression is clearly seen to be less than or equal
to $M^* n^{\varepsilon} \log n$ for some $M^* < \infty$ by the argument used in the case $k = 1$.
The first bracketed expression can now be broken up into $k$ expressions,
$k - 1$ of which are less than or equal to $M^* n^{\varepsilon} \log n$, with the remaining
expression being


m.-l 

m.-l 

n        i 

l 

I    I    ■ 

••  l    i 

i=k  x  =0 

v° 

i  n 


0 


Lj=k  Z=1       j-k+i 


Uz)>    i 


.€ 


Pa       (x/>) 

z=i    yi-k+i  £ 


But  for  each  (k-l)-tuple  of  possible  values  (x  ,  x  ,    ...    ,   x        ) 

i  _L    £.  K  —  J_ 

r  i 


either  I 


I 


UJ, 


.€ 


is  zero  for  all 


■j=k  £=1       j-k+i       J 

x,  =  0,  1,  . . .  ,  m.  -  1  or  there  exists  a  non  negative  integer 

a,  .(x  ,  ...  ,  x^   )  <  m.  -  1  such  that  the  indicator  is  zero  for 

x,  =  0,  1,  . . .  ,  a,  .  -  1,  and  one  for  x,  =  a,  . .  We  may  now  use  the 

same  arguments  as  in  the  case  k  =  1,  for  those  (k-l) -tuples 

(x..,  ...  ,  x,  )   such  that  the  indicator  function  is  not  zero  for 

x,  =  0,  1,  ...  ,   m.  -  1,  to  show  that 


„  v1 


i   k 
V 


n 


UJ, 


i=l  x=0   Lj=k  £=1       j-k+i 


£=1        i-k+i 


i*i> 


n     oo    k 
-  i=l  x=a  ,1=1       i-k+i 
*m  X  Vm'  A  VJ1*1


< 


M"  M  n 

n 


Thus: 


m.-l  m.-l  .        . 

n        1  1  r    1       k 

e  i  •■■  i  i  z  n 


i=k  x  =0 


-j=k  i=l       j-k+i 


.€ 

1 


Vq  (xj 

i=l       i-k+i 


M 
n  n 


M         m.-l 

n  l 


<  I    I    •••     I      Si 

i=k  Xl=0  ^     -0  xk=0 


i       k 


I     H     %  (X.P> 

J=k  i=i      j-k+i 


k 


n 

i=l       i-k+i 


PQ  (x£) 


M 


<       I 


Mn       M"   M  n€ 


Xl=°  «k-l-° 


WT 


< 


M"(M  )kn€ 
n 


-~WT 


Upon  combining  the  above  result  with  the  previous  k  inequalities  we 
have  the  desired  conclusion,  and  the  proof  of  the  lemma  is  complete. 
We turn now to the task of showing condition b) of theorem 2) is
satisfied for a suitable choice of $\delta_i$ and $\varepsilon_i$. The theorem will

then  give  us  an  upper  bound  for  &"\<£  >  §)      and  we  shall  then  see  it 
has  the  claimed  rate  of  convergence  to  zero.   Recalling 


$$Q_i = \frac{1}{i - k + 1} \sum_{j=k}^{i} \prod_{l=1}^{k} p_{\theta_{j-k+l}}(X_{i-k+l}) \qquad i = k, k+1, \ldots, n$$


and 


defining  a.  =  l-j     we  have  from  lemma  6)  -  £  P[Q.  <  a.]  < r^r — 

i=k  n  ' 


e.   +  PS 

l  i 

Earlier  in  the  proof  we   shoved  we  could  take     i.   =  ■  ■  ■■/„ — ^  ..  \ 


and       £     =r  2k 


2e^(i-k+l)  e1 

•  "  kg2(X±)       '   «^7        "kToy-i) 
e  +  e 


26^( j -k+l)  +S . 

'  ,  l 


for  arbitrary  functions  6   and  e .   such  that 


kg(X,  +  1) 


5i^°^5i>i-k+l   ei^°^€i> 


> 


kg(X±) 
i  -  k  +  1 


log  I 


Let    i   be  the  smallest  integer  such  that 


>  - 


-k  +  l 


and  such  that   i  >  2k. 
o  — 
Also let


f  g(X±  +  l)Qi  log  i 


7^ 


if     Q,   > 


i  "775 


i  =  i  ,    i     +  1.    .., 

o'      o  ' 


•*- 


0 


otherwise 


g(X  )o     log  i 


77T 


775 


H 


otherwise 


Then     v     i  >  i 


i  1 

E  LQig(x1  +  i)|Qi  -7i75J 


Q,  > 


L4i-p75. 


log   i 

7I7T 


E 


'     g(XI>      ,  1    1 

.g(x.  +  i)lQi  ^7I75_ 


Q,  > 


375 


<  M*  ■  ■  ■  y]  ■         since   from  condition  li) 


g(xt) 
!|i(x7TTT_ 


M* 


Also 


E 


[Qlg(x;+ Di^i  ^H^^ 


i1^ 


Thus 


log  i 


ErijQi  >  a^P^  >  a±l  <  ~^~  [M*  +  P] 
Using the definitions of $\delta_i$ and $\varepsilon_i$ we have


rk   Q.  log  i   2  Q.(i  -  k  +  l)  log2  1 
k  -   ta  eXp  \—^ ^ 


whenever  Q.  >    »  .   Thus 


E 


y«i^-^c 


P  Q..  > 


-TT75 


<  4k  exp 


4  log  i   2(i  -  k  +  l)log  i 
.lA  ki 


-  i 


noting  Q.  <  1, 

°   l  — 


Collecting  terms  we  see  that  condition  b)  is  satisfied  and 


{*(>  e)   <<k-  X>B  +i° 


i0-k+(M*+P)  I^j 

i=i   i 
o 


2(i-k+l)  log  i   4  log  i 


ki 


TW 


+  4Bk   J  e 


1=1 


2B2M  loffk  n 


+ W 


for  all  Be  0,    . 

We  have  now  given  an  upper  bound  for  o(_cp,  0)      for  all  n,  and  it 
remains  to  find  the  rate  at  which  this  bound  goes  to  zero.   Examining 
the  various  components  of  the  upper  bound  we  have 


(k  -  1)IT  =  Q|  1 
n        I  n 
and:


n        o  '  In/ 


f(M,tMj^  =  0|>1 


8B2k    " 

)        e 


2(i-k+l)n      2.        U  log   i 

1  ,    . "log     1     +   r^ 

ki  &  .1/4 


n 


1=1 


■  4-a 


28^1  logk  n 

175 =  ° 


i      k     ! 
log     a 


t    n 


w 


Hence 


In' 


uniformly in $\underline{\theta}$.


Q.E.D. 


We observe that the sum of $r$ independent identically distributed
random variables, each with density function $p_\theta(x) = \theta^x h(\theta) g(x)$, has
the density function $f_\theta(y) = \theta^y h^r(\theta) g^{(r)}(y)$, where $g^{(r)}$ is the
$r$-fold convolution of $g$. Thus the density function of the sum has
the same form, and if conditions ii), iii), and iv) are satisfied for
$g^{(r)}(y)$ then theorem 3) may be used even if the original problem is
modified to allow $r$ independent observations for each $\theta_i$, observing
that the sum is sufficient for $\theta_i$. For the geometric, negative
binomial, and Poisson families $g^{(r)}$ satisfies the conditions.
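The $r$-fold convolution in the remark above can be computed directly for small cases. The following minimal sketch (the helper names `convolve` and `r_fold` are hypothetical) checks two closed forms that follow from the definition: for the Poisson family $g^{(r)}(y) = r^y / y!$, and for the geometric family $g^{(r)}(y) = \binom{y + r - 1}{r - 1}$.

```python
import math


def convolve(g1, g2, n_terms):
    """First n_terms values of the convolution of two sequences on 0, 1, 2, ..."""
    return [
        sum(g1[x] * g2[y - x] for x in range(y + 1))
        for y in range(n_terms)
    ]


def r_fold(g, r, n_terms):
    """First n_terms values of the r-fold convolution of the function g."""
    values = [g(x) for x in range(n_terms)]
    result = values
    for _ in range(r - 1):
        result = convolve(result, values, n_terms)
    return result


if __name__ == "__main__":
    poisson = r_fold(lambda x: 1.0 / math.factorial(x), 3, 8)
    print(poisson)                           # compare with 3**y / y!
    print(r_fold(lambda x: 1.0, 3, 6))       # compare with C(y + 2, 2)
```

Only the first `n_terms` values are needed because the convolution at $y$ involves $g(0), \ldots, g(y)$ only.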




B.   The  modified  negative  binomial  distribution. 


Let

$$P[X = x \mid \theta] = p_\theta(x) = \binom{a + x - 1}{x} \left( \frac{a}{a + \theta} \right)^{a} \left( \frac{\theta}{a + \theta} \right)^{x}$$

$x = 0, 1, 2, \ldots$,   $a > 0$,   $0 \le \theta \le \beta < \infty$.

This reparameterization of the negative binomial is of interest for two
reasons. First, $\forall\, a$, $E[X \mid \theta] = \theta$, unlike the usual parameterization.
Secondly, the form of the decision procedure is different from that
usually encountered.
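The first property can be checked numerically. The sketch below assumes the density reads $\binom{a+x-1}{x} (a/(a+\theta))^a (\theta/(a+\theta))^x$; the function names and the truncation point are choices made for the illustration.

```python
import math


def modified_nb_pmf(x, theta, a):
    """Assumed form: C(a+x-1, x) * (a/(a+theta))**a * (theta/(a+theta))**x."""
    log_coeff = math.lgamma(a + x) - math.lgamma(a) - math.lgamma(x + 1)
    return (
        math.exp(log_coeff)
        * (a / (a + theta)) ** a
        * (theta / (a + theta)) ** x
    )


def truncated_mean(theta, a, terms=2000):
    """Sum of x * p_theta(x) over the first `terms` integers."""
    return sum(x * modified_nb_pmf(x, theta, a) for x in range(terms))


if __name__ == "__main__":
    for theta, a in [(1.7, 0.8), (4.0, 3.5), (0.0, 2.0)]:
        print(theta, a, truncated_mean(theta, a))
```

Under this parameterization the success probability is $\theta / (a + \theta)$, so the usual negative binomial mean $a p / (1 - p)$ reduces to $\theta$ for every $a$.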

In this and the following sections we shall not give as many
details as in Section A. The method of proving asymptotic optimality is
similar in each of these sections, and may be summarized as follows:

k       Qi(^k} 
Since  \|r.(x  )  =  -r— ■> r  when  Q.(x  )  >  0,  we  seek  an  estimator 

1  ~K         ^  ^2SiJ  1  ~K 

k                                                                                                               Pi(^k) 
cp.(x  )      which  is  for  "most"      x       equal  to  a  ratio     ■=— ■> r   ,    such  that 

E[^i-k(~i)l^i   =  ^k]    =  Qi-k(^k}      and   SUCh  that      E[Pi-k(~i)l-i   =  ^k] 

=  Q.  ,  (x,  ) .   Then  using  the  methods  of  Section  A  the  functions  i. 
l-k  — k  i 

and  C,  .      may  be  obtained  for  condition  a)  of  theorem  2).   It  is  then 
only necessary to show either condition b) or b') holds.




Let: 


W  = 


1   if  x*  =  ^ 

0  otherwise 
a  +  i  -  I 


g(i,  J)  = 


a  +  i  +  J  -  1 

i  +  J 


i  = 


J  « 


0,  1, 
1*  2, 


g(x,  ,  t)   if  there  exists  t  =  1,  2,  .. 


W 


such  that 


«< 


0 


otherwise 


'i<^>  " 


i  -  k  +  1 


3=k 


5.  W 


W  ■  f^i^TT 


W 


'w  lf  p^)>0 

0       otherwise 




2 


if  i  =  1,  ...  ,  k  -  1 


^i^i)  =  {    Pi^i)   if  °  <  *}(*£)  5  P   i  =  k,  k  +  1, 


if  P  <  P*(X  ) 
l—i 


i  =  k,  k  +  1, 


V 


Using  the  above  definitions  we  shall  show  that  t£     =  (cp  ,  cp  ,  . ..  )   is 
uniformly  asymptotically  optimal  of  k    order  and  c  (cp  ,    6)   <  B(k,  n) 


=   0 


log     n| 

n     '  i    k 

Recall         Q^)   =   ,  ,   k  +  1     I     JI     P0  .   v      (*j> 

j=k  .0=1        j-k+£ 


QI(^)  =  rrrn  Z 


Pe  (xj 

j=k     J   i=l       j-k+i 


We  observe  there  exists  a  set  R.   of  k  dimensional  vectors  such  that 

l 

P[XVeR.]  =  1  and  Xv£R .  =>  Q.  (x.)  >  0.   Thus  for   i  >k  and  Xv€R., 

k       Qi(^k} 
^i(2Sk)  =  ^  /  \    *      For  i  >  2k,   x^R^   J  =  k,  . . .  ,  i  -  k  we  have 


J  x  x      *   /ii  yj-k+i  ^ 




We  also  have 


k-1 


t=l.  j        ,6=1   j-k+* 


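But

$$\sum_{t=1}^{\infty} g(x, t)\, p_\theta(x + t) = \sum_{t=1}^{\infty} a \binom{a + x - 1}{x} \left( \frac{a}{a + \theta} \right)^{a} \left( \frac{\theta}{a + \theta} \right)^{x + t}$$

$$= a\, p_\theta(x) \sum_{t=1}^{\infty} \left( \frac{\theta}{a + \theta} \right)^{t}$$

$$= \theta\, p_\theta(x)$$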
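The identity $\sum_{t \ge 1} g(x, t)\, p_\theta(x + t) = \theta\, p_\theta(x)$ can be checked numerically. The sketch below is built on an assumption: it takes $g(x, t) = a \binom{a+x-1}{x} / \binom{a+x+t-1}{x+t}$, which is the form that makes the identity hold, since the factor $a$ is not clearly legible in the definition of $g$. All function names are hypothetical.

```python
import math


def log_binom(a, x):
    """log of the generalized binomial coefficient C(a + x - 1, x)."""
    return math.lgamma(a + x) - math.lgamma(a) - math.lgamma(x + 1)


def pmf(x, theta, a):
    """Modified negative binomial density with mean theta (assumed form)."""
    return (
        math.exp(log_binom(a, x))
        * (a / (a + theta)) ** a
        * (theta / (a + theta)) ** x
    )


def g(x, t, a):
    """Assumed weight: a * C(a+x-1, x) / C(a+x+t-1, x+t)."""
    return a * math.exp(log_binom(a, x) - log_binom(a, x + t))


def weighted_tail(x, theta, a, terms=2000):
    """Truncated sum over t >= 1 of g(x, t) * p_theta(x + t)."""
    return sum(g(x, t, a) * pmf(x + t, theta, a) for t in range(1, terms))


if __name__ == "__main__":
    x, theta, a = 3, 1.7, 0.8
    print(weighted_tail(x, theta, a), theta * pmf(x, theta, a))
```

Each term reduces to $a\, p_\theta(x) (\theta / (a + \theta))^t$, so the sum is a geometric series with value $\theta\, p_\theta(x)$.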


Hence 


*l?±(£)  |3$  -  xj 


1  -  k  +  1   ^i-k^'     i 


^n     I.^»L^-\1 


j-i-k+1 


^(4)1^  =  ^1 


-  2k 


rr  «iV^> +  t^t     Z   «*,<£)  l£  =  aj 


j=i-k+l 


The  remainder  of  the  proof  follows  that  in  Section  A) . 




C.   The binomial distribution.


Let $P[X = x \mid \theta] = p_\theta(x) = \binom{a}{x} \theta^{x} (1 - \theta)^{a - x}$, $x = 0, 1, \ldots, a$, where
$a$ is a known positive integer and $0 \le \theta \le 1$. For this family it is
necessary to modify slightly the definition of asymptotic optimality.
Robbins [5] and others have demonstrated why this modification is necessary.
Let $R_{k,a}(\underline{\theta}_n)$ be the $k$th standard as defined in definition 3)
with the parameter value $a$. We shall develop a procedure $\underline{\varphi}^k$ such that

c)   $\overline{\lim}_{n \to \infty} \left\{ \sup_{\underline{\theta}} \left[ R_n(\underline{\varphi}^k, \underline{\theta}) - R_{k,a-1}(\underline{\theta}_n) \right] \right\} \le 0$.

Such a procedure will be said to have property c). In addition we shall

show  R  (cpk,    0)    '   \   a-l--n^  -  B^k'  n^  =  °  I^iAI   uniformly  in  0. 

We shall first exhibit a procedure having property c). We shall
then introduce a new procedure which not only has property c) but for
most $\underline{\theta}$ actually improves upon the original procedure at each stage and
produces strict inequality in equation c).

We first assume that corresponding to every observation $X$ we have
available the related observation $X'$ which would have resulted had we
observed a binomial random variable with parameter $a - 1$. For example,
if $X$ is the number of successes in $a$ independent Bernoulli trials
with probability $\theta$ of success, then $X'$ is the number of successes in
the first $a - 1$ of these $a$ trials. While in most situations this
assumption will hold, we shall see later that it will not be needed.




Let: 


£»=( 


(.X .     ,  o  «  >  ,  X .  J 
i-k+1        1 


if  v  =  0 


■iiV 


(X!  .  .., 


...  ,  X!  n,  X.)   if  v  =  1 

'   l-l   l 


W 


1  if  X*  .  =  x. 

-j,0   -It 


0  elsewhere 


w = 


\  +  i  if  ^5,1  =  W'  ••• '  vi'  \  +  x) 


otherwise 


'i<*> 


?  W 


J=k 


i-k  +  1 


^  " 


■1=k  J 

a(i  -  k  +  1) 


W 


l(Sk> 


if  P.Cx.  )  >  0 


Pf(xk)  =  < 


otherwise 




k 


k(x.)  =( 
i—i    \ 


f~.  II    X  —  _L  j      •  •  *  j     ri    ™  J_ 

PJ(xJ  0)  ^  o  <  PjCxJ  0)  <  i  l  =  k,  k  +  l,  ... 

1        if  1  <  Pftf  J      i  =  k,  k  +  1,  ... 


if  1  <  P*(X^  J 
l  — 1,0' 


We shall now show $\underline{\varphi}^k = (\varphi_1^k, \varphi_2^k, \ldots)$ has property c). We shall
proceed as in the previous examples, noting that theorem 2) is still true
when the property of asymptotic optimality is replaced by property c).


V  i  >  2k  let : 


i   k 


<M^  -  i-k  +  i  I 


a  -  1 
x 


j=k  /=1  I  I  l 


x 


j-W1    ,j-W     X 


R.   be  a  set  of  k  dimensional  vectors  such  that 
l 

P[X^  0€Ri]  -  1  and  xkeR1 ->  Qj  (^J  >0  . 


For  x,  eR.   we  have  for  the  parameter  a  -  1: 


♦ft^ 


+ 1  £  e 


a-l-x, 


^W1 


8.i-k+i> 




Now  for     j    =  k,    ...    ,    i  -   k 


k     a  -  l\   x 


1   JT-i"-!       -V-  Jl,      x,         ,i-k+iv  .i-k+i; 


a-1- 


/=i  I  xi     I  J~k+i  j-k+i' 


EfZ.Cx1^)!}^  -  xj 


\ 


I   x.  +1 
U    +  1     J 


a-x  -1  k-1 


a  -  1    x, 


a-l-x 


'j-k+r1  "  Gj-k+i^ 


k     a  -   1     x.  a-l-x. 

J    AM    v  j-k+rx         j-k+i' 


Thus  arguing  as   in  Section  A) 


\fa)-itej\>^l\  +  *\)\£,o-£t_ 


ki 


te 


-r(i-k+l)5f+i|-8. 

k  ii 


for     B,   >- 


i  -  i  -  k  +  1 


We now look for an upper bound for the quantity

$$\sum_{i=k}^{n} P\left[ \sum_{j=k}^{i} \prod_{l=1}^{k} p_{\theta_{j-k+l}}(X_{i-k+l}) < i^{\varepsilon} \right].$$

We shall show an upper bound is $(a + 1)^k n^{\varepsilon}$.


I P  U  &  -w^5 <  i£]  - 




i   k 


I    -  I     E  i    E  I  pe       (x,),  i' 

x  =0    x ,  =0  i=k   L.i=k  i=l   .1-k+J 


k 

=1   i-k+i 


For any fixed $\underline{x}_k$, $x_l = 0, 1, \ldots, a$, $l = 1, 2, \ldots, k$, there exists
a subsequence of integers $\{i_v\}$ (possibly finite) such that

$$\sum_{j=k}^{i} \prod_{l=1}^{k} p_{\theta_{j-k+l}}(x_l) < i^{\varepsilon} \iff i = i_v \text{ for some } v = 1, 2, \ldots .$$

Let $i_0 = 0$. But for all $n \ge k$ there exists a $v_n$ such that
$i_v \le n \Rightarrow v \le v_n$ and:


i=k 


I     11    Pe  (x*)>    ±X 

j=k  i=l       j-k+i 


it 


p*       (x^) 


J  i=l       i-k+i 


<  i  n  p. 


(xj   <  if,     <  n£ 


i=]  i  Vn 


Since there are $(a + 1)^k$ different $\underline{x}_k$ to sum over, we have shown
the claimed upper bound holds.

It  is  now  easy  to  show,  using  an  argument  similar  to  that  in 
Section  A,  that  an  upper  bound  for  R  ((£,   a)      is 


j^i  +  2(1  .  k)  a  i  i2£i 

n  no  n    .  V         1/4 


1=1       n 
o 


8k     " 

—     I      exp 


%  log  i       2(1  -  k  +  1)    .2 


1=1 


"T7+" 


ki 


log     i 


2(a  +  1)J 


n 


W 
and  clearly  this  upper  bound  is  uniformly  of  the  order  ■  y /i -n  .   The
desired  conclusion  follows. 

In the above procedure we chose at times to neglect the results of
the "$a$th trial" in many of the observations. This choice of which
information to neglect was quite arbitrary, and it is easily seen that
the above proof does not depend on which trial's information was neglected.
We may thus conclude that if in some situation the related observation
$X'$ is not obtainable, we may construct a new $X'$ which will do as well.
An example will illustrate. Suppose $a = 17$ and $X = 10$. With the aid
of some random device we let

    X' =  9  with probability 10/17
    X' = 10  with probability  7/17

This $X'$ will work as well as the original $X'$.
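The randomization in this example generalizes to: given $X = x$ out of $a$ trials, set $X' = x - 1$ with probability $x / a$ and $X' = x$ otherwise. The following minimal sketch (function names are hypothetical) draws such an $X'$ and verifies by exact computation that, when $X$ is binomial with parameters $(a, \theta)$, the constructed $X'$ is binomial with parameters $(a - 1, \theta)$.

```python
import math
import random


def draw_x_prime(x, a, rng=random):
    """Return x - 1 with probability x / a, otherwise x."""
    return x - 1 if rng.random() < x / a else x


def binom_pmf(x, n, theta):
    """Binomial(n, theta) probability of x successes."""
    return math.comb(n, x) * theta ** x * (1 - theta) ** (n - x)


def x_prime_pmf(y, a, theta):
    """Exact probability that X' = y when X ~ Binomial(a, theta)."""
    stay = binom_pmf(y, a, theta) * (a - y) / a          # X = y, no success removed
    drop = binom_pmf(y + 1, a, theta) * (y + 1) / a      # X = y + 1, one removed
    return stay + drop


if __name__ == "__main__":
    a, theta = 17, 0.3
    gap = max(
        abs(x_prime_pmf(y, a, theta) - binom_pmf(y, a - 1, theta))
        for y in range(a)
    )
    print("largest discrepancy:", gap)
    print("one draw for X = 10:", draw_x_prime(10, a))
```

The probability $x / a$ is the chance that a trial chosen at random among the $a$ trials was a success, which is why deleting a random trial and deleting the last trial lead to the same distribution.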

'"k  k 

We  shall  now  exhibit  a  procedure  C£   which  improves  upon  (£    , 

th  -c 

the   i    stage  of  the  decision  problem  we  could  have  defined  cp".   in 

any  one  of  several  different  ways,  depending  on  what  information  we 

chose  to  neglect.  For  fixed  i  >  k  and  x   there  are  a   ways  to 

define  X^t    _  and  hence  Y,(x,).   Thus  there  are  a  U~  ~  '      ways  to 
— J,0  J  —  K. 


At 
define  P.(x,  ),   Similarly  there  are  a     ways  to  define  Z ,(x.)   and

hence  a  '  ways  to  define  P.(x,).  Thus  there  are 

a  ways  to  define  P*(x,  ) .   Prom  the  above,  and  observing 

the  XT  _  used  in  P*(X7  ~)  could  have  a   different  definitions,  we 
—1,0  1  — 1,0'  7 

see  there  are  at  least  a     '*         ways  to  define  the  random 
variable  9j(X.).  Most  of  these  different  definitions  will  result  in 
essentially  the  same  estimator  for  large  i.  We  may  obtain  an  improved 

procedure,  however,  by  considering  some  of  them. 

We define $\underline{X}^k_{i,(u)}$, $u = 1, \ldots, a^k$, as follows: Let $u = 1, \ldots, a^k$
be an indexing of the $a^k$ distinct $k$-tuples each of whose elements are
integers from the set $\{1, 2, \ldots, a\}$. Let the $u$th $k$-tuple be
$(t_{u,1}, \ldots, t_{u,k})$. Let $X^{(l)}$, $l = 1, \ldots, a$, be the random variable
derived from $X$ by not counting the result of the $l$th trial. For
example, $X^{(a)}$ equals the previously defined $X'$. We now define
$\underline{X}^k_{i,(u)}$ to be the random vector $(X_{i-k+1}^{(t_{u,1})}, X_{i-k+2}^{(t_{u,2})}, \ldots, X_i^{(t_{u,k})})$.


/ 

1 


Let:   q)k  (X.)  =  / 

i,u  — i    " 


if  i  =  1,  . . .  ,  k  -  1 

p*(xv"  {   J  if  o  <  p*(xt\/  s)  <  1  i  =  k,  k  +  1,  . 
i  — i,(u;        —  i  — i(u;  — 

1         if  1  <  Ptrf,  0      i  =  k,  k  +  1,  . 

i  — i(u;  '      ' 


V 
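Let:

$$R(\varphi, \theta_i) = E[(\varphi(\underline{X}_i) - \theta_i)^2]$$

$$\tilde{\varphi}_i^k(\underline{X}_i) = \frac{1}{a^k} \sum_{u=1}^{a^k} \varphi_{i,u}^k(\underline{X}_i)$$

$$\underline{\tilde{\varphi}}^k = (\tilde{\varphi}_1^k, \tilde{\varphi}_2^k, \ldots)$$

We shall now show $\forall\, \underline{\theta}$, $\forall\, i$, $R(\tilde{\varphi}_i^k, \theta_i) \le R(\varphi_i^k, \theta_i)$, where $\varphi_i^k$ is
as previously defined. We first observe that the $\varphi_{i,u}^k(\underline{X}_i)$ are
identically distributed as $\varphi_i^k(\underline{X}_i)$. Then, suppressing the superscript $k$,
we have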

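$$R(\tilde{\varphi}_i, \theta_i) = E[(\tilde{\varphi}_i(\underline{X}_i) - \theta_i)^2] = E[\tilde{\varphi}_i^2] - 2 \theta_i E[\tilde{\varphi}_i] + \theta_i^2$$

$$= \frac{1}{a^{2k}} \left[ \sum_{u=1}^{a^k} E[\varphi_{i,u}^2] + 2 \sum_{u<v} E[\varphi_{i,u} \varphi_{i,v}] \right] - \frac{2 \theta_i}{a^k} \sum_{u=1}^{a^k} E[\varphi_{i,u}] + \theta_i^2$$

$$= E[(\varphi_i - \theta_i)^2] - \frac{a^k - 1}{a^k} E[\varphi_i^2] + \frac{2}{a^{2k}} \sum_{u<v} E[\varphi_{i,u} \varphi_{i,v}]$$

$$= R(\varphi_i, \theta_i) - \frac{1}{a^{2k}} \sum_{u<v} \left[ E[\varphi_{i,u}^2] - 2 E[\varphi_{i,u} \varphi_{i,v}] + E[\varphi_{i,v}^2] \right]$$

$$= R(\varphi_i, \theta_i) - \frac{1}{a^{2k}} \sum_{u<v} E[(\varphi_{i,u} - \varphi_{i,v})^2]$$

$$\le R(\varphi_i, \theta_i)$$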
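The risk reduction from averaging identically distributed estimators can be verified by exact enumeration in the smallest nontrivial case. The sketch below assumes $a = 2$ and $k = 1$, so there are $a^k = 2$ versions of the estimator, one for each trial that is left out; the estimator table `f` and the function name are hypothetical and stand in for any estimator based on a single retained trial.

```python
from itertools import product


def risks(theta, f):
    """Exact risks for a = 2, k = 1.

    Returns (risk of one version, risk of the average of both versions,
    expected squared difference between the two versions).
    """
    r_single = r_avg = spread = 0.0
    for t1, t2 in product((0, 1), repeat=2):
        prob = (theta if t1 else 1 - theta) * (theta if t2 else 1 - theta)
        phi_1 = f[t2]   # version that ignores trial 1
        phi_2 = f[t1]   # version that ignores trial 2
        phi_avg = 0.5 * (phi_1 + phi_2)
        r_single += prob * (phi_1 - theta) ** 2
        r_avg += prob * (phi_avg - theta) ** 2
        spread += prob * (phi_1 - phi_2) ** 2
    return r_single, r_avg, spread


if __name__ == "__main__":
    r_single, r_avg, spread = risks(0.3, {0: 0.2, 1: 0.9})
    # With a**k = 2 versions there is a single pair u < v, weighted by 1 / 2**2.
    print(r_avg, r_single - spread / 4)
```

The two printed numbers agree, which is the final equality of the derivation with $a^{2k} = 4$ and a single pair $u < v$.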


We notice strict inequality holds unless $P[\varphi_{i,u}(\underline{X}_i) = \varphi_{i,v}(\underline{X}_i)] = 1$
for all $u, v$, $1 \le u, v \le a^k$. Thus we have strict inequality holding
unless $a = 1$ or unless $\underline{\theta}_i$ is composed of 1's and 0's. To
investigate the asymptotic properties of $\underline{\tilde{\varphi}}^k$ we observe:


'   "=  %     I  tR<V  ii>  "  RK>  ^l)]  +  \  X   B(V  Si)  -   \,a-l^n)} 

n-»  w  ^  i=l  i=l  J 

<l^j;     I  faC?..*  M*)  ~  *(%>  Jjf     since     $     has  property  c) 

n_s.  ooLU     -1-1                     X  X                          X                  -> 


n->  °°  *-     i=l 


11m 
n->  w 


{-  \%\  I  "«%,  -  *i,/i 

>^         i=l  a       u<v  '  7 


<  0 


Hence $\underline{\tilde{\varphi}}^k$ has property c). In order for strict inequality to hold it is
sufficient that there exists $\varepsilon > 0$ such that, with the possible
exception of a finite number of values of $i$, $\sum_{u<v} E[|\varphi_{i,u} - \varphi_{i,v}|] > \varepsilon$.
If $a = 1$ this condition is never satisfied. For $a > 1$, however, and
for a large class of $\underline{\theta}$ such an $\varepsilon$ will exist. Let $\Omega^*$ be the set
of $\underline{\theta}$ such that $\forall\, \underline{\theta} \in \Omega^*$ there exist $\xi_1$, $\xi_2$ such that
$0 < \xi_1 \le \theta_i \le \xi_2 < 1$ for all but finitely many $i$ and such that the
first order empirical distribution function of $\underline{\theta}_n$ does not tend in
the limit to the distribution function of a degenerate random variable.
It may then be shown that $a > 1$, $\underline{\theta} \in \Omega^*$ implies
$\sum_{u<v} E[|\varphi_{i,u} - \varphi_{i,v}|] > \varepsilon > 0$ for all except possibly a finite number
of $i$.
Since the sum of independent identically distributed binomial
random variables is again a binomial random variable, and since the sum
is a sufficient statistic for $\theta$, it is clear the methods of this
section can be applied to the case of $r$ independent observations for
each $\theta_i$.

D.   The  normal  distribution 


Let

$$P[X \le x \mid \theta] = \int_{-\infty}^{x} p_\theta(t)\, dt = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-\frac{1}{2} \left( \frac{t - \theta}{\sigma} \right)^2} dt, \qquad -\infty < x < \infty$$

for $\sigma > 0$ and $-\infty < \alpha \le \theta \le \beta < \infty$.

For the present we assume $\sigma$ is known, although later we shall modify
this assumption somewhat. As we shall see, the estimation procedure
in the continuous case is similar to that in the discrete case. Without
loss of generality we take $\sigma = 1$. We fix $k \ge 1$.

Let $\{c_i,\ i = 1, 2, \ldots\}$ be a sequence of positive numbers such
that $\lim_{i \to \infty} c_i \log i = 0$ and $\lim_{i \to \infty} i\, (c_i)^{4(k+1)} = \infty$.



Let $\underline{e}_k$ be the $k$-dimensional vector consisting of zeros for
all components except for the $k$th, which is equal to one.


^iW  "  < 


1      If     y,  -   8±  <  XyM  <Yi  +   ci     I 


0     otherwise 


=   I,       ..    ,    k 


i-k 


■itok)  -  -r^ 


(i    ■  k  +  l)(2c.) 


g±(zk)  - 


fi(jk  +  OA1  -  fi(£k  "  CA) 


2c 


y„  + 


!itek)    . 


*+W   -    *i 


f     f .  (y  )   >  0     and     i  =  k,  k  +  1,    . 


*Kzk> 


vy* 


otherwise 


a 


if     P*(x^)  <  a 
<p*(x.)  =  {  P*(x*)     if    a<P*(x*)  <p 
p  if    p  <  p* (xj) 


We shall now prove the decision procedure $\underline{\varphi}^k = (\varphi_1^k, \varphi_2^k, \ldots)$ is
uniformly asymptotically optimal of $k$th order.
We shall show the conditions of the corollary to theorem 2) are
satisfied.


^t.   <^k)  =  Z  Is)  e 

,1—  Jl 


i      2  ~2  ^^i^j-k+i' 


Wk>  =  -ST- 


£w    «     ^2 


i         2  ^2  /  (yi"0j-k+P 


Now  for  any  _y ', 


1  r  i  »2 


But     q(ik)  -  -  V  (yk  -  6  )|i|   e 

J-K. 

Hence   ^(y^  -  yfc  +  ^-y  . 
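For $k = 1$ and unit variance, the relation behind an estimator of this form is the standard identity $E[\theta \mid y] = y + f'(y) / f(y)$, where $f$ is the mixture density of the observation. The following minimal sketch checks the identity against a directly computed posterior mean, replacing $f'$ by a central difference as in the definition of the estimator. The parameter values, weights, and step size are hypothetical choices for the illustration.

```python
import math

THETAS = (-1.0, 0.5, 2.0)    # hypothetical parameter values
WEIGHTS = (0.2, 0.5, 0.3)    # hypothetical relative frequencies of those values


def normal_pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def marginal(y):
    """Mixture density f(y) of an observation with unit variance."""
    return sum(w * normal_pdf(y - th) for w, th in zip(WEIGHTS, THETAS))


def posterior_mean(y):
    """Posterior mean of theta given y, computed directly from the mixture."""
    num = sum(w * th * normal_pdf(y - th) for w, th in zip(WEIGHTS, THETAS))
    return num / marginal(y)


def derivative_form(y, c=1e-4):
    """y + f'(y) / f(y), with f' replaced by a central difference of width 2c."""
    slope = (marginal(y + c) - marginal(y - c)) / (2.0 * c)
    return y + slope / marginal(y)


if __name__ == "__main__":
    for y in (-2.0, 0.3, 1.5):
        print(y, posterior_mean(y), derivative_form(y))
```

In the estimation procedure the unknown density and its difference quotient are replaced by counts of past observations falling in intervals of half-width $c_i$, which is why $c_i$ must shrink slowly as $i$ grows.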
Let :        m .

1 


tek>-  ntj^-xj 


k  2 


i-k  |-  -|  ^(ytr0j.k+i)2   -I  ^1(yi-ej-w) 


/     n       j=k  L  

Z     (y )     =  Jl __ 

1     k  (i   -   k  +   l)(2*)k/2 


for  some   y*       such  that     y.   -   c.    <  y$    .<y/7  +   c. 

-K,J  *  1  £,J  -K  1 

$l = 1, \ldots, k$. What particular $y^*_{l,j}$ is intended
will be clear from the context.


Then: 


y/Ci       1/,  «     x2 


1/j.  o         \ 

/   x         1       >"  ft        /     1    2V*"  j-k+i' 


i-k  k 
m.(y,  )  =  -r-. r  v  -,  %   7  IT  rsr-      —  e  ~    "  """  dt 


yrci 


j=k  i=l   i  ww_m  V2ir 


i-  k+1 


+  H<Zk) 


where  the  y*  .  in  z.   are  those  vectors  whose  components 
arise  from  use  of  the  mean  value  theorem. 


k      oo 
Now:   for  V  i  =  2k,  2k  +  1,  ...  ,   y  eR  ,    9e   ft 
If.tf)  -

1    lx— l' 

i   -   k  +  1 

<  p 

i   - 

k  +  1    | 

i   - 

2k  +   l,xi 

>  Sj^  ■  2k 


k  +  1 


(S.  -  I  z. 


-  2k  +  lv  i   '  i1    i  -  k  +  1' 


but  from  lemma  3)  this  probability  is 


.  2  (i  -  ^  +  1^(5.  -  |z,|  -  .   *   J2(2c.)2k 

k       l    '  l1    l  -  k  +  1     l 


,02(k+l)  2kR    02k+l/i  -  k  +  lx  2k/R    i   n2 
ci  S±  -  2    ( )c.  (8jL  -  |z1|) 


provided  8 .  -  |  z . 


i   'i1    i-k  +  1- 


>  0  . 


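In a similar manner, let:

\[ z_i'(\underline{y}_k) = \frac{z_i(\underline{y}_k + c_i \underline{e}_k) - z_i(\underline{y}_k - c_i \underline{e}_k)}{2c_i}, \]

\[ q_i(\underline{y}_k) = \frac{1}{i-k+1} \left[ \frac{Q_{i-k}(\underline{y}_k + c_i \underline{e}_k) - Q_{i-k}(\underline{y}_k - c_i \underline{e}_k)}{2c_i} - Q_{i-k}'(\underline{y}_k) \right]. \]

Then

\[ E[g_i(\underline{X}_i^k) \mid \underline{X}_i^k = \underline{y}_k] = \frac{Q_{i-k}'(\underline{y}_k)}{i-k+1} + q_i(\underline{y}_k) + z_i'(\underline{y}_k). \]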
Hence


P    |g,(x?) 


Q!(xk) 
1—1 


i  -i'        i  -  k  +  1' 


>  «±i3q  ■  ik 


<  2k  exp 


2(i  -  k  +   l)f 
k  k 


6 .    -      z 

i        '    i1 


Li'         i  -   k  + 


2   (2=.)2(k+l) 


<  2k 


f,n      N2  k+l)            02k+l  2  k+l)    i-k+1,  i     , i         i       i  \2[ 

exp  ^(2c.)v         '  e  .    -   2  c.  '  : ( e.    -      z        -      q.)     f 

*   [^      i'  i  i  k  i        '    i1        '    i1      J 


provided     e .    -    |  z '.  | 


*i' 


i-k  +  1    - 


>  0 


Thus, using an argument similar to that used in proving lemma 4) and
letting $B = \max[|\alpha|, |\beta|]$, we have:


i-k+1 


<f.(X.)   -  *.(X.)|    >    ft'4)     (£i  +  Ki  +  lykM£  "Ik 


,02(k+l)   2k~          -2k+l,i-k  +  lx    2k,.  i       i»! 

C  2k  exp  \2  'c±  B±  -  2  ( g )c±   (5j,    -    |zjL|) 


f/.      \2(k+l)            02k+l  2(k+l)    i-k  +  1/  i     ,i         i       n2l 

2kexpj(2c.)^  ei  "   2  V  £ (ei  "    '  Zl'    ~    lqi'  )  J 


provided     5  .    -    |  z  .  |    -  - >  0 


i1         i-k+1  - 


-    q 


i1        i-k  +  1  - 


>  0 
To complete the proof it is sufficient to show that for some functions

cifak>  -j.)      and     8j.(Zk>  -±)>    such  that  either     e±  >  h||    +   \q±\    +  j  _  g  +  ± 

i      i  k 

or     €i   =  0     and  either     8     >  Jz   |    + ■  ■ •         or     5.    =  0,   the  following 

limits  hold  uniformly  in    Q, 


i) 

11m  E 

i-»  «> 

ii) 

11m  E 

i->«> 

i  -  k  +  1 

Q±(xJ) 


>.&) 


-  0 


1  "  V  1  (B  4-    1X^)5   (xj) 


ill)     11m  E 


i-»  oo  L 


-icf(5i(^)-|Zl(4)|): 


5.     -        Z.        > r-Ty- 

1  1    —  /j       1-1-  ^1/^ 

(l  -  k  +  1)    '    J 


=  0 


iv)      lira  E 
1-*  oo 


r-iof^.^H^JI-K^IV 


«±-  Kl  -  hL\  >- — L— 175!-° 

(i-k+  l)  '    J 


i-» »  l  (i  -  k  +  1;   '    J 


-   0 


vi)     lim  P 
i-»  00 


11  (i-k+  l)   ' 


=  0 
Let


5-  -< 


Q, 


Q, 


if 


>  z.  + 


(i  -  k  +  l)log  i  U   (i  -  k  +  l)log  i  -  '  i1    (i  _  k  +  i)1/1^ 


otherwise 


Q, 


Q, 


-r  if 


>  z!   +  q.|  + 


(i  -  k  +  l)log  i  1X  (i-k+l)log  i^lzi'  T  l^il  "  (i_k+i) 


?75 


0 


otherwise 


Since there exists $U < \infty$ such that $E[|X|] < U$ for all $\theta \in \Omega$, the first
two limits hold uniformly. The third and fourth also hold, recalling

lim  ic.^     =  °°. 
i-»  w 

Since for every $\eta > 0$ there exists $S(\eta) < \infty$ such that $P[|X| > S] < \eta$ for
all $\theta \in \Omega$, we have:


5  .     -     I  Z  .  I     <  i :— n~ 


<  p 


Q, 


(i  -  k  +  l)log   i  "     I    i 


< 


(i  -  k  +  l)1^ 


lXi-k+i'   ^S     i  =  lf    "•   >   k 


+  i  -  (i  -  n)J 


6l 


i  /  .  ik/2  T,A/Xi-k+.T0j-k+i' 

aW  e 

i  -  k  +  1 


.    k  log  i       i   I  -,   ., 
< msi — _  +  z  log  i 

(i  -  k  +  i)1'*         x 


x 


i-k*J  -s  J  *  1»  *••  ' 


+  1  -  (1  -  n) 


k 


<  P 


1  \k/2  -|k(S+B); 


LI2*/ 


e 


.    k  log  i     .IK 

< ^ T7I7  +  K  lQg  x 

(i  -  k  +  l)1^     X 


Xi-k+il  <S 


+  1  -  (1  -  *l)  ■ 

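Clearly the fifth limit will hold if we show $\lim_{i\to\infty} |z_i(\underline{y}_k)| \log i = 0$
uniformly in $\underline{\theta} \in \Omega^\infty$ and $|y_\ell| \le S$, $\ell = 1, \ldots, k$. By a similar argument,
to prove the sixth limit holds we need only show
$\lim_{i\to\infty} (|z_i'(\underline{y}_k)| + |q_i(\underline{y}_k)|) \log i = 0$ uniformly for $\underline{\theta} \in \Omega^\infty$ and $|y_\ell| \le S$,
$\ell = 1, \ldots, k$.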

We first consider $\lim_{i\to\infty} |z_i(\underline{y}_k)| \log i$. For $y_\ell - c_i < y^*_{\ell,j} < y_\ell + c_i$,
$\ell = 1, \ldots, k$; $j = k, \ldots, i$:


1  k 


-  e 


1  k 


): 


=  e 


T^yt-^-k+i^ 


1  ~  e 


■i=i 


Vk+i)2-(y!-ej-k+p2] 
But


<  c1(2S  +  2B  +  Ci)   for  all     04k+ie: 


Now 


x     <  1=»    1  -   e 


-   I    I   ~l   <  1*1    (•  "   1) 

v=i  vl 


Thus 


liM    = 


i-kr  ~2^1(yt,j~  Vk+i^ 
I    e 


"2^1^yi""9j-k+i^   - 
-   e 


(i  -  k+  1)(2jc) 


k/2 


< \fe  c±(S  +  B  +  Cl)(e  -   1)  k   for  all    £   and     i     suffi- 


(at) 


ciently  large. 


But since $\lim_{i\to\infty} c_i \log i = 0$ we have shown $\lim_{i\to\infty} |z_i| \log i = 0$ for all
$\underline{\theta}$ and $|y_\ell| \le S$, $\ell = 1, \ldots, k$.


We now consider $\lim_{i\to\infty} |z_i'| \log i$. This limit is more difficult to
evaluate. By examining the argument leading to the consideration of this
limit it is clear that we need only show $\lim_{i\to\infty} |z_i'| \log i = 0$ for $\underline{y}_k$ such that
$|y_\ell| \le S$ for $\ell = 1, \ldots, k$ and $y_k \ne \theta_j$ for $j = k, k+1, \ldots$. We

shall  use  lemma  5).   It  is  clear  that  there  exists  M  <  oo  such  that 

CL^  =  (-<»,  oo )  for  all  0e  ft.  Thus  we  have  for  i  >  k  and  y   such 

that   |yi|  <  S  and  yfc  f  0       £   =  1,  ...  ,  k  J  =  k,  k  +  1,  ... 
,k-l


2iM 


v-     e  \e  °        -   e 

k72 


J=k 


|(yj**- */ 


2ci(i  -  k  +  l)(2ir) 


y    e L2 -   e u 

2Cl(i  -  k  +  l)(2jr)k/2 


J=k 


vhere     y^  -  c±  <  yj  <  y,  +  o±       y**  =  yR  +  2p^Je1 


*k**  =  yk  "   2p2,JCi 


li     ^  l*V&t  M  e 


v,j 


21    < 


3lyk  "   ej 


V  =  1,    2 


i-k    "U^a-*^ 


,k-l 


and         z 


<  I 


:  j=k  (i   -  k  +   l)(2,t)k/22o. 


-iZ(yl-y,)(yy-yr2ej-k+P 


e  -   e 


1/    2 


-    e 


-==<c>2c,y,  -2c,0.)  2c.(y,-e.) 

2V    l        i^k       i  jy,  ^     iwk     J'v 
We shall now consider various parts of the right hand side of the above
inequality.

For all $j = k, \ldots, i$ and $i$ sufficiently large:

1  k"1 

2  A  (yl-  yP(yi  +  y^ 


0   Let  L    *   -  I 
1'J   v=l 


IV 


!=1 


2ej-k+P 


vi 


then 


^  j|  <  c^e  -  1)(S  +  B  +  Ci)(k  -  1) 


.k-1 


and  e 


■v—JL 


-1  +  5 


1,J 


b)  Let  £ 


2,J 


I 
v=l 


("  2  Ci 


-  c.y.  +  c.e   ) 
ik    i  J7 


vi 


then  |£2  |  <  Ci(e  -  l)(S  +  B  +  c.) 


1/  2 


and  e 


■^i+2ciyk-2ciV  =  x 


+  s 


2,j 


-    c-[2(yk  -  OJ 
c)    M    53,j"Ji (v  +  i): 


v+l 


then     \i        \    <Ci(22(B+S)    .   i) 


and 


1  -   e 


2ei<VV 


=  -   2(yk  "   V    "    h,l 
-f(yk**  -  yk)(y£*  +  yk  -  zejl


d)    Let    L    A  =    y  — £-Jb £ — £ 

^       V=2  V1 


00   t-  |(y***  -  yk)(yk***  +  \  -  2d^ 

V=2  Ci 


V 


(S+B+c.) 
then      |^|    <  2c.(e  1     -   l) 


-^(yk**-yk)(y*-*  +  yk-20.)        -^^-y ^(y***^-    20  ) 

e  -   e 

and     


c . 

1 


■JHt 

-  (yk  -yk)(yk^+yk-2ej)+  (y***- yk)(y***+  yk-  20J 

2c~  +  ^,j 


e)     Let     g         =  2c.(p^   -  p^) 


then     |£        |    <  2c. 


(yk*  "  yk)(yk*  +  yk  "  20j)   "   (yk**  '   yk)(yk**  +  yk  '  2V 


and  2c 

1 


(y**)2  -  (y***)2  -  20j(y{*  -  yk***) 

=  __ 

1 


2c. 
1 

2,    2 


^1^1,3  *  P2,j}  +  4cl(pll3  "  P2,j 

2c. 

■  2K  -  V(pi,j  +  p2,j>  +  h,t  ■ 


\  -  *Vi(*i..i +  pa..i> 
f)  Let $\xi_{6,j} = \rho_{1,j} + \rho_{2,j} - 1$


\r        l<r  I  li         I  1.    .  8^  2(B+S) 

then       O    A<    Pn    .   "  o     +     Po    a   ~  o \    <  c<  "^TZ ST  e 

°>J  1,J        21  2, J        2'   -     I   5| y     -   6    | 


and     2(yk  -  Bj^  +  p2>J)  =  2(yk  -  8j)(i  +  ^ 


where   u7   ,  <  0±  «*k  J(E+S)£ 


Using  the  above  six  results  we  have  for  sufficiently  large   i: 

1  ~  j=k  2(2l)k/2(i  -  k  +  1) 

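$\le M^* c_i$, where $M^*$ is some finite constant independent of $\underline{\theta}$, $i$,
and $\underline{y}_k$.

Hence $\lim_{i\to\infty} |z_i'| \log i = 0$ uniformly in $\underline{\theta}$ and $\underline{y}_k$ such that $|y_\ell| \le S$,
$\ell = 1, \ldots, k$, and $y_k \ne \theta_j$, $j = k, k+1, \ldots$.

We now consider $\lim_{i\to\infty} |q_i(\underline{y}_k)| \log i$ for $|y_\ell| \le S$, $\ell = 1, \ldots, k$.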
,k-l


w     x       T^^j-wj)2,  -|<yk+v^ 


q±tek) 


„  J=k(2n) 


k72 


-   e 


2™k  ci     y 


2(1  -  k  +  l)c 


i  -  k 


1-k 
k  +  1   .\    'yk 


k/2     "2  jR7*"9*-**** 

,)  Is)      e 


So 


_i/y    +c     _£)      )?  _l/y     _c     _0      )! 

i-k      Fk    i    y  2^yk  ci    y 

KM  5  i  _  k  +  i   I 


2c 


+  (yk  -  »,) 


■K-6/ 


1/   2 


1/    2 


5  — 


i-k 

- —  y 


H  0^20^-20.6  j  )  -g<o1-2o1yk^oiej) 


-   e 


2c 


+  <-\  -  V 


let     £j- 


E 

V=2 


t-i<T  +  y* 


VI 


2c±Vt 


V=2 


r_c     (_i     _     y        +0      )] 

L   C±K  2        ^k         y1 

2c .  v ! 

l 


B+S+c 


then 


y    <  2Ci(e  X  -   1) 
~kc2+2c.y.  -2c. 0.  ~(c2-2c.y  +2c.0  .)

2     1        lk       l  j  2      i        i  k        ij 

and  $— ■ "  e         =  -  yv  +  6  .  +   £  .  • 

2c,  Jk    j   bj 


Hence $|q_i(\underline{y}_k)| \le 2c_i(e^{B+S+c_i} - 1)$ for all $\underline{\theta}$, $\underline{y}_k$ such that $|y_\ell| \le S$,
and $\lim_{i\to\infty} |q_i(\underline{y}_k)| \log i = 0$ uniformly in $\underline{\theta}$ as desired.

This completes the proof that the decision procedure $\underline{\varphi}^k$ is
uniformly asymptotically optimal of $k$th order. $\underline{\varphi}^k$ was defined for the
case $\sigma = 1$. If we had kept arbitrary $\sigma$ then $\underline{\varphi}^k$ would have been
defined in the same manner except that $g_i(\underline{y}_k)$ would have been defined
as

\[ \sigma^2\, \frac{f_i(\underline{y}_k + c_i \underline{e}_k) - f_i(\underline{y}_k - c_i \underline{e}_k)}{2c_i}. \]

If we relax the assumption $\sigma$ known to the assumption $\sigma$ unknown but
equal for all observations, then it may be shown that if $\hat\sigma_i$ is an
estimate which converges in probability to $\sigma$ uniformly in $\underline{\theta}$, we may
replace $\sigma$ with $\hat\sigma_i$ in the definition of $g_i(\underline{y}_k)$, and the resulting
decision procedure is still uniformly asymptotically optimal.

If the problem is modified to allow $r$ independent observations
for each $\theta_i$, then since the sum of these $r$ observations is sufficient
for $\theta_i$ and also normally distributed, the above procedure will still
apply. We note in this case that if the common variance is unknown,
then for each $i$ the usual estimate $\hat\sigma_i^2$ is independent of $\theta_i$ and
$\frac{1}{n} \sum_{i=1}^{n} \hat\sigma_i^2$ is a consistent estimate for $\sigma^2$.
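The remark on the unknown common variance can be illustrated with a small simulation. The sketch below is a hedged illustration only: it assumes the "usual estimate" is the unbiased within-group sample variance, and the helper names `simulate` and `pooled_variance` are introduced here. Because each within-group variance does not depend on the group mean, the average over groups settles near $\sigma^2$ whatever the sequence of $\theta_i$.

```python
import random
import statistics


def simulate(thetas, r, sigma, seed=0):
    """Draw r normal observations with mean theta and s.d. sigma for each theta."""
    rng = random.Random(seed)
    return [[rng.gauss(t, sigma) for _ in range(r)] for t in thetas]


def pooled_variance(groups):
    """Average of the unbiased within-group sample variances."""
    return sum(statistics.variance(g) for g in groups) / len(groups)
```

With a few thousand groups of five observations each and $\sigma = 2$, `pooled_variance(simulate(thetas, 5, 2.0))` should come out close to 4 for any choice of `thetas`.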
E.   The gamma distribution.


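Let

\[ P[X \le x \mid \theta] = \int_0^x p_\theta(t)\, dt = \frac{\theta}{\Gamma(a)} \int_0^x (\theta t)^{a-1} e^{-\theta t}\, dt, \qquad 0 < x < \infty, \]

for $a > 0$, $0 < \alpha \le \theta \le \beta < \infty$.

We assume $a$ is known, and fix $k \ge 1$.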


Let      {c}>      e.  ,  Y  .    .,    f.,    and     g.      be  defined  as   in  Section  D. 
l  — k        J  f  I        l  l 

Let: 


-   1       gi^k}      . 


—7 r     if     f  .(y.  )   >  0     and     i  =  k,   k  +  i 

yk         fi^ 


W    = 


otherwise 


'k 


<(Sl) 


P* 


IP 


if    P*(X  )  <  a 
i  — i'   — 

*(xj)     if    a  <  PJ(X^)  <  P 


if  p  <  p*(x^) 

—   l—i 


We shall prove the decision procedure $\underline{\varphi}^k = (\varphi_1^k, \varphi_2^k, \ldots)$ is
uniformly asymptotically optimal of $k$th order.

We shall show the conditions of the corollary to theorem 2) are
satisfied. Since much of the argument is similar to that in the normal
case we shall omit many of the intermediate steps.


i 
Lett   Q  (yj  =  Y, 
J=k 


ft  ei-M     a-1  e-^Vk+i 


Qi(yk) 


i         X       ^  *     1   a.  0 


J-k+i  a-1  ~y/ j-k+i 


Then for all $\underline{y}_k$ such that $y_\ell > 0$, $\ell = 1, \ldots, k$,


But 


"     9 j-k+i     a-1     ~y/ J-k+i 


<**>  ■  mi ■*& * 


J     p  ^k  j 

TO  e 


(a  -  l)y 


a- 


2        Q     a-1 


J   k 


a-1 


Qi(jk)  -  Q!(yk). 


Hence $\psi_i(\underline{y}_k) = \dfrac{a-1}{y_k} - \dfrac{Q_i'(\underline{y}_k)}{Q_i(\underline{y}_k)}$.
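The gamma identity above can be checked in the same spirit as the normal case. In this sketch (illustrative values, names introduced here) the derivative of $Q_i$ with respect to the last coordinate $y_k$ is approximated by a central difference rather than written out, which is an assumption of the example and not part of the report's argument.

```python
import math


def gamma_weights(y, thetas, k, a):
    """w_j = prod_l theta^a * y_l^(a-1) * exp(-y_l * theta) / Gamma(a), j = k..i."""
    ws = []
    for j in range(k, len(thetas) + 1):
        window = thetas[j - k:j]             # theta_{j-k+1}, ..., theta_j
        ws.append(math.prod(
            t ** a * yl ** (a - 1) * math.exp(-yl * t) / math.gamma(a)
            for yl, t in zip(y, window)
        ))
    return ws


def gamma_psi_direct(y, thetas, k, a):
    """Weighted mean of theta_j, j = k..i."""
    ws = gamma_weights(y, thetas, k, a)
    return sum(t * w for t, w in zip(thetas[k - 1:], ws)) / sum(ws)


def gamma_psi_via_derivative(y, thetas, k, a, h=1e-6):
    """(a - 1)/y_k - Q'/Q with Q' taken by a central difference in y_k."""
    up = list(y)
    up[-1] += h
    down = list(y)
    down[-1] -= h
    q_up = sum(gamma_weights(up, thetas, k, a))
    q_down = sum(gamma_weights(down, thetas, k, a))
    q_mid = sum(gamma_weights(y, thetas, k, a))
    return (a - 1) / y[-1] - (q_up - q_down) / (2 * h) / q_mid
```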

Let,       m^)   -«*<xj)  |lj  -  J^l 


;t^k)  -  i-k  +  i  X  LJi  ^tsr  (3rl,j> 


,a 


9J-k+i   /      xa-1     "^j-k+i 


■JLw^ 


for $y^*_{\ell,j}$ such that $y_\ell - c_i < y^*_{\ell,j} < y_\ell + c_i$.


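Then:

\[ m_i(\underline{y}_k) = \frac{Q_{i-k}(\underline{y}_k)}{i-k+1} + z_i(\underline{y}_k) \]

for suitable $y^*_{\ell,j}$,

and for $i = k, k+1, \ldots$, $\underline{y}_k$ such that $y_\ell > 0$, $\ell = 1, \ldots, k$, $\underline{\theta} \in \Omega^\infty$: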


Q02(k+i)   2k            „2k+I   i-  k+  1  2k, _         ,       ,v 
<  2k  exp  -<  B2  c .   5  .    -   2  - -c  .    ( 5  .  -     z  .    ) 

li  k  i        i      '    i1 


kB 


provided     5.    -      z.      — . 

i        '    l1         i  -  k  +  1  — 


>  0 


vhere     B   -  max  ■{      3    -- -, — — 
L        Ha) 


L(a  -    l)a_1  -(a-1) 


-k 


,  ir  • 
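Let:

\[ z_i'(\underline{y}_k) = \frac{z_i(\underline{y}_k + c_i \underline{e}_k) - z_i(\underline{y}_k - c_i \underline{e}_k)}{2c_i}, \]

\[ q_i(\underline{y}_k) = \frac{1}{i-k+1} \left[ \frac{Q_{i-k}(\underline{y}_k + c_i \underline{e}_k) - Q_{i-k}(\underline{y}_k - c_i \underline{e}_k)}{2c_i} - Q_{i-k}'(\underline{y}_k) \right]. \]

Then:

\[ E[g_i(\underline{X}_i^k) \mid \underline{X}_i^k = \underline{y}_k] = \frac{Q_{i-k}'(\underline{y}_k)}{i-k+1} + q_i(\underline{y}_k) + z_i'(\underline{y}_k) \]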


and  fcr     i   -  k,    k  +  1,    . . .        y,       such  that     y .  >  0     £  =  1,    . . .    ,   k     0e  fl 


D„/0      \2(k+l)  02k+i  2<k+l)    i  -   k  +   1    ,  r     .  i         i       i  *i 

.  2k  exp     B*(2ci)  ■  € .    -   2  c±x  k  (e.    -    |z'|    -    |  q.±  |  ) 


provided     e.    -    |  z\  |    -    |  q_. 


kB* 


i  -  k  +  1 


>  0 


where     B*  <  °°     is   such  that        max      (p   (x)) 

u 
x-y>0 


k-1 


SpQ(t) 


<  B* 


t.=  ' 


Thus  we  have : 


i  -  k  +  1 


P     |cp.(X.)   -  \|r.(X.)|    >      "/      i        U.    +  P&. 

L        1-1  1    "I'        -       Qi(Zk)  X  ! 


a  —    i  c    ■  -  i  -.rk 

&  .     I  X  .    =   y 
yk       i    '-i       ^k 


^(k+l)  2k           _2k+l  i-k  +  1     2k,  R  ,      i  v2 

<  2k  exp  i  B2  c  .   o  .    -  2  .- c  .    (  5  ,    -      z  ,    ) 

li  k  i    '    i        '    i1 


+  2k 


f.D„,_      ,2(k+l)           _2k+l  2(k+l)   i-k  +  1/ 
exp^B^c.j  /ei  -   2  c.  - k -(e1 
kP

provided  5  .  -   z  .   - ■ — ■ — - 

*  i'i1    i-k  +  1 


>  0 


e±-    l.'l 


"   <L 


kl/ 


i-k  +  1  - 


>  0 


To complete the proof it is sufficient to show that, for appropriate
$\delta_i$ and $\epsilon_i$, the limits i), iii), iv), v) and vi) listed in Section D
hold uniformly in $\underline{\theta}$, and that


ii')   lim  E 

i— >  °o 


1  -  y i  (p  +  f  )6,(^) 


=  0   uniformly  in 


Let; 


(i  -  k  +  l)log  i  (i  -  k  +  1)  log  i  -   '    i1         , 


8i^)  - 


kE 


k  +  1) 


0 


otherwise 


(  Qi(^k} 

(i  -  k  +  l)log   i  ~(i  -  k  +  l)log~i  -    '  zi 


9i 


>  I  <  I    +    I  4  -, 


'!<*>   "  ^ 


ot  nerwise 


k  B* 


(i   -  k  +  1) 


lA 


Then limits i), iii), and iv) clearly hold. Since $E[X] \le a/\alpha$ for
all $\theta \in \Omega$, limit ii') also holds. For every $\eta > 0$ there exist $s(\eta)$, $S(\eta)$
such that $0 < s < S < \infty$, $P[X < s] < \eta/2$ and $P[X > S] < \eta/2$ for all
$\theta \in \Omega$. Hence it remains only to show:


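$\lim_{i\to\infty} |z_i(\underline{y}_k)| \log i = 0$ uniformly in $\underline{\theta}$ and $\underline{y}_k$ such that
$s \le y_\ell \le S$, $\ell = 1, \ldots, k$;

$\lim_{i\to\infty} |z_i'(\underline{y}_k)| \log i = 0$ uniformly in $\underline{\theta}$ and $\underline{y}_k$ such that
$s \le y_\ell \le S$ and $y_k \ne \dfrac{a-1}{\theta_j}$, $\ell = 1, \ldots, k$, $j = k, k+1, \ldots$;

$\lim_{i\to\infty} |q_i(\underline{y}_k)| \log i = 0$ uniformly in $\underline{\theta}$ and $\underline{y}_k$ such that
$s \le y_\ell \le S$, $\ell = 1, \ldots, k$.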


Now 


i-k 


»itek)l  ^rrt^i  I 


j-k+i 


"A   (r(a))k 


i^ 


-V*    0 
)a-le  yi,j   j-k+i 


a-1  -^Gj-k+i| 
For all $j = k, \ldots, i - k$:


4 
Let      C,    „  =  1  -  ~ 


■i,j 


then       yi-ci<yJ<yi  +  c1^l-^<-<l+^ 


c. 


■*l«i,il<T 


a-1 


so  that 


r=i 


yj 


yj 


k 


1  -  ^,rx  -  * +  «i 


=i 


where,  for  i  so  large  that  c.  <  s, 


±     —  s 


Let        £, 


I 


k 


j—i- 


& 


then     U2|    <  c   (e^   -   l)      for      |c±|    <  1 


and     e 


"^  (3t-yj)0j-w-i 


- 1  + 1 
Thus


i  (^r1  i3**-™  -  j[ 


a-1     ~y/j-k+ii 

y*     e 


ft       a-1     ~yiV^|  ft 

2iy^  e       lk 


n 


a-1 


-^3Tj^J-fcfi 


-   1 


,a-l 


<Sa-x|(l-  ^)(1+  e2)   -  l| 


,a-l 


5  8^(1^1  +  |S  |  +  |^2|) 


so that $\lim_{i\to\infty} |z_i(\underline{y}_k)| \log i = 0$ uniformly.

We now consider $\lim_{i\to\infty} |z_i'(\underline{y}_k)| \log i$. As in the normal case we shall
use lemma 5).

We have:


;i(y*) i  s  (i  -  k  + 1)  £ 


1-1    A,      j-k+i 


J=k     (r(a))J 


rk-1  ~v*     (9 

-£=1  '  ^ 


-y         0  -yOHHffl    • 

(y**   )  e     k'J    J    -    (y***)a~1  e     k'«J    J 

k>j  k^j 


£C 


a-1  e"yi0j-k+i' 


,  va-1       vyk     i;    J        ,  x 

(yv  +  CJ         e  -   (yv  -  c.) 


a-1 


(•Vci'ejl 


rk       "i- 


'k         i 


< 


2c. 
Qk(a-l)Aka  i-1

<_ — s L, —     y, 

'  (i  -  k  +  i)(r(a))k  J=k 


w 


a-1 


"(yJ,ryi)eJ-w-i 


ff i  - 

t/  \f  m       CI™"  _i 

e  ^yk,j  %y  j 

2c± 

r/y>  +  c 


a-1 


-   e 


(yk+c.-^)8j 


2C.0J 


2c 


where:       ^  -  Tk  +  2?^^       y***  =  yk  -  2p2^c. 


(P 


v,J 


2}  5  3IOT  ^  "^j 


4MT(a)eaSc. 


a-1  -  e  yfc 


V  =   1,    2 


We  shall  now  consider  various  parts  of  the  preceding  inequality  for  some 

fixed  j  k  <  j  <  i  -  k.  To  simplify  the  notation  we  let  y   y   . 

kj  J 

$c = c_i$, $\theta = \theta_j$ in what follows. We assume $i$ sufficiently large.


')  ^  v  =  i-% 


then  |J   |  <i 
If]  -1+£1

where      KJ    <  c2(k-l)(a-l). 

mm 


b)      Let     5     =     I 


V=2 


then     |^2 1    <  c     — g- 


/    ,yy  1  A** J-  *— Pt  ^ 

and     Mu-I  =  (l  +  — =-) 


a-1 


2(a  -   l)p  c 

1  +  i_  +  g 


c)      Let     £_  - 


V=2 


r;1) 


2p2c 


VI 


2  2a+1 
then      1^1    <  c     — tj" 


and 


y  y 


a-l 


2(a  -   l)Poc 

y  3 


d)      Let      £.    =     X 


a  -   ll  |  c 


y ,  "y 


V=2 


2  2a": 


then      |£.  |    <  c     — ^ 
and     [ZjtS]     '    m  !  +   (a  -  l?c  +  g
I     y     /  y 


a  -  li  J y 


e)    **    <5  '  vl  ('  v    ) 


i     i  2  28'**"1" 

then     J  £  |   <  c    • — g- 

s 


a-1 
and 


c-V 


VI 


|Z^£)   "     =  x  ,   (a  -  l)c  +  * 

l     y    /  y  5 

-     [2c(p     +  Pp)0lV 

f)   w   s6-  E V^— 

V=2 
then     U6I   <  c2(e^   -  l) 

-(y***_y**)e         2c(p  +p  )e 
and     e  =  e  =  1  +  2c(Pl  +  p,,)©   +  ^ 


;)     Let    c7  -    J 


«     [(2Pl  -   l)c0]V 


v=l  vS 


then     |S J   <  c(eP    -   l) 

and     ew        "      '     =  e  -  1  +  £_,    . 


00        .  N  y 

h)      Let     Co  =     F     I200) 


V-2 


8o 


then     \U   <  c2(e&  -  l) 


and     e2c6  =  1  +  2c0  +  L 


i)     Let     £n  = 


v=i 


-k-1  -i 


vi 


then     K9|    <c(e(k"1^   -   l) 


and 


i=l 


Thus 


a-1 


a  1  yi  I 


xJ      "v  y   ! 

2c 


-  e 


-(y+c-y**)e 


-  i^r  e- 

2c 

(1+  ^)(1+ c9) 


-  (i  +  i7) 


2(a-l)pc  2(a-l)pc 

i+ — ±-  +  !;2-(i — ^  +  g(i  +  2c(p1  +  p2)e+£6) 


2c 


[ltia-^£t^.  [i .  i^iis  +  jj(i  +  2cB  +  5b)] 


2c 


31 


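\[ = \left| (\rho_1 + \rho_2 - 1)\left(\frac{a-1}{y} - \theta\right) + \xi \right| \]

where $|\xi| \le cM^*$ for some finite constant $M^*$ independent of $\underline{y}_k$,
$\underline{\theta}$, or $j$,

\[ \le \left|\rho_1 - \tfrac{1}{2}\right| \left|\frac{a-1}{y} - \theta\right| + \left|\rho_2 - \tfrac{1}{2}\right| \left|\frac{a-1}{y} - \theta\right| + |\xi| \]

$\le cM^{**}$ for some finite constant $M^{**}$ independent of $\underline{y}_k$, $\underline{\theta}$, or $j$.

Hence $\lim_{i\to\infty} |z_i'(\underline{y}_k)| \log i = 0$ uniformly in $\underline{\theta}$ and $\underline{y}_k$ for the set of
$\underline{y}_k$ of interest.

It now remains only to show $\lim_{i\to\infty} |q_i(\underline{y}_k)| \log i = 0$ uniformly for
$\underline{y}_k$ such that $s \le y_\ell \le S$, $\ell = 1, \ldots, k$. But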


Kkk>l*TTTn  £  IJ^-fig 


"'^sUji     a-1     -^ej-w|1. 


7*        e 


TTaJ 


-y 


k  J 


■          a- J.           Q  a- J.  Q 

y,    +  c,|            -c.S.  /  y    -   c.i  c  ,6 , 

II                i  «J  I    k         1  1  j 

e  -     e 


*k 


y, 


2c 


(a  -   1)    _ 


rk 


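So it is sufficient to show that for $s \le y \le S$, $\theta \in \Omega$,

\[ \left| \frac{\left(\frac{y+c}{y}\right)^{a-1} e^{-c\theta} - \left(\frac{y-c}{y}\right)^{a-1} e^{c\theta}}{2c} - \frac{a-1}{y} + \theta \right| \le cM \]

where $M$ is some finite constant independent of $\theta$, $y$, or $c$. Using
parts d), e) and slight modifications of h) in the proof of the previous
limit we have: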
JLp)   e^-(^-g)   e°*   a.1


2c 


-i^-y^^j('i-  c6  ^  - 


y 


E5|(i+«e  ♦  «§> 


2c 


$\le cM$ as desired. This completes the proof that the decision
procedure $\underline{\varphi}^k$ is uniformly asymptotically optimal of $k$th order.

We note, as in the previous sections, that if the problem is
modified to allow $r$ independent observations for each $\theta_i$, then the
sum of these observations may be used to obtain an asymptotically
optimal decision procedure of $k$th order.
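The last bound in the proof, on the gap between the symmetric difference quotient and the logarithmic derivative $(a-1)/y - \theta$ of the gamma density, can be examined numerically. The sketch below uses illustrative values chosen here and a hypothetical helper name; it only shows the gap shrinking as $c$ decreases and staying below a constant multiple of $c$ for those values.

```python
import math


def quotient_error(y, c, theta, a):
    """|difference quotient of y^(a-1) e^(-y theta), normalized, minus ((a-1)/y - theta)|."""
    plus = ((y + c) / y) ** (a - 1) * math.exp(-c * theta)
    minus = ((y - c) / y) ** (a - 1) * math.exp(c * theta)
    return abs((plus - minus) / (2 * c) - (a - 1) / y + theta)
```

Evaluating `quotient_error(1.5, c, 2.0, 3.0)` for `c` in `(0.1, 0.05, 0.025)` gives a decreasing sequence, consistent with a bound of the form $cM$.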


We now consider the case in which the other parameter is unknown,
that is, $p_\theta(x) = \dfrac{\lambda^\theta}{\Gamma(\theta)}\, x^{\theta-1} e^{-\lambda x}$ for known $\lambda$. It may be shown that if


1  +  X  + 


?t%)   ■  < 


i  +  »•*, 


otherwise 


V 


and all other definitions in this section are unchanged, then the
resulting $\underline{\varphi}^k$ is uniformly asymptotically optimal of $k$th order.
F.   The non-parametric case.

We now consider the following problem. Let $\mathcal{P} = \{p_\theta(\cdot) : \theta \in \Omega\}$
be a class of probability mass functions, each of which assigns
probability one to a specified denumerable class $\mathcal{X} = \{x\}$ of real numbers.
$\Omega$ is an arbitrary index set. We assume for each $\theta$ in $\Omega$ that $p_\theta(\cdot)$
is completely specified. Let $h(\cdot)$ be a real valued function on $\mathcal{X}$.
Let $\lambda(\theta) = E[h(X) \mid \theta]$. We assume $E[h^2(X) \mid \theta] \le B < \infty$ for all $\theta$ in
$\Omega$. For some unknown $\theta_j \in \Omega$ we observe $r + 1$ independent identically
distributed random variables $X_{j,1}, \ldots, X_{j,r+1}$ with $P[X_{j,s} = x]
= p_{\theta_j}(x)$, $s = 1, \ldots, r + 1$, $x \in \mathcal{X}$. We wish to estimate $\lambda(\theta_j)$ on the
basis of these observations. For example, if $h(x) = x$ then we are
estimating $E[X \mid \theta]$. If $\varphi_j$ is our estimate we suffer a loss of
$(\varphi_j - \lambda(\theta_j))^2$. We now assume we are faced with a sequence of such
decisions. In other words a sequence $\{\theta_j,\ j = 1, 2, \ldots\}$ is
selected from $\Omega^\infty$. For each $\theta_j$ we have $r + 1$ observations and we
may use $\underline{X}_j$ to estimate $\theta_j$, where $\underline{X}_j$ is the $j \times (r+1)$ matrix of
observations $(X_{i,s})$. Johns [3] has considered this problem under the
assumption that each $\theta_j$ is an independent observation of an $\Omega$-valued
random variable $\theta$ with unknown a priori probability measure $G$
defined over a suitable $\sigma$-algebra of subsets of $\Omega$. We shall consider
the case in which the sequence $\{\theta_j\}$ is arbitrarily chosen.

As in the previous cases we need a standard to use in evaluating
a particular decision procedure. For any $\underline{\theta}_n \in \Omega^n$ we form the $k$th
order empirical probability measure $G_n^k$ such that for any sets
$\Omega_1, \ldots, \Omega_k$ in the $\sigma$-algebra,

\[ G_n^k(\Omega_1, \ldots, \Omega_k) = \frac{1}{n-k+1}\, [\text{number of } i\ (k \le i \le n) \text{ such that } \theta_{i-k+\ell} \in \Omega_\ell,\ \ell = 1, \ldots, k]. \]

If we assume for $1 \le k \le m$ that
$\{\theta_i,\ i = 1, \ldots, m\}$ is a sequence of random variables taking values
in $\Omega$ with $\underline{\theta}_m^k$ having any $k$-dimensional probability measure $G^k$, and
$\underline{\theta}_m^k$ is independent of $\theta_1, \ldots, \theta_{m-k}$, then the Bayes estimate for
$\lambda(\theta_m)$ is $E[\lambda(\theta_m) \mid \underline{X}_m^k]$ and the Bayes risk is
$R_{r+1}(G^k) = E\{[\lambda(\theta_m) - E[\lambda(\theta_m) \mid \underline{X}_m^k]]^2\}$, where $\underline{X}_m^k$ is the $k \times (r+1)$
matrix consisting of the last $k$ rows of $\underline{X}_m$ and the subscript $r+1$
in the Bayes risk refers to the number of observations for each parameter
value. We now take as our standard $R_{k,r}(\underline{\theta}_n) = R_r(G_n^k)$, and seek a
procedure $\underline{\varphi}$ such that


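F)
\[ \lim_{n\to\infty}\ \sup_{\underline{\theta} \in \Omega^\infty} \left[ \frac{1}{n} \sum_{i=1}^{n} E[(\varphi_i(\underline{X}_i) - \lambda(\theta_i))^2] - R_{k,r}(\underline{\theta}_n) \right] \le 0. \]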
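The $k$th order empirical probability measure used in the standard above is easy to compute when the parameter values come from a finite set. The following is a minimal sketch under that assumption; the function name and the dictionary representation of the measure (point masses on $k$-tuples) are choices made here for illustration.

```python
from collections import Counter
from fractions import Fraction


def empirical_kth_order(thetas, k):
    """Relative frequency of each k-tuple (theta_{i-k+1}, ..., theta_i), k <= i <= n."""
    n = len(thetas)
    counts = Counter(tuple(thetas[i - k:i]) for i in range(k, n + 1))
    return {window: Fraction(c, n - k + 1) for window, c in counts.items()}
```

For the alternating sequence `[0, 1, 0, 1, 0]` with `k = 2`, the measure puts mass one half on each of `(0, 1)` and `(1, 0)`, which records the dependence between neighbouring parameters that the first order empirical measure ignores.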


We observe that $R_{k,r+1}(\underline{\theta}_n)$ is not a desirable standard since if $\mathcal{P}$ is
the class of binomial densities, for example, then, as mentioned earlier,
$R_{k,r+1}(\underline{\theta}_n)$ could not be achieved.

We observe that theorems 1) and 2) are still valid in this case
when $R_{k,r}(\underline{\theta}_n)$ is substituted for $R_k(\underline{\theta}_n)$ and property F) for the
property of uniformly asymptotically optimal of $k$th order.

We define:

\[ A^k = \begin{pmatrix} x_{1,1} & \cdots & x_{1,r} \\ \vdots & & \vdots \\ x_{k,1} & \cdots & x_{k,r} \end{pmatrix} \]

where $x_{\ell,s}$ is an arbitrary real number.

$A^k_{(q)}$, $q = 1, 2, \ldots, m(A^k)$, to be the $m(A^k)$ distinct matrices
obtained from $A^k$ by independently permuting the elements
within each row. Clearly $1 \le m(A^k) \le (r!)^k$.

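\[ \underline{X}^k_{j,r} = \begin{pmatrix} X_{j-k+1,1} & X_{j-k+1,2} & \cdots & X_{j-k+1,r} \\ X_{j-k+2,1} & X_{j-k+2,2} & \cdots & X_{j-k+2,r} \\ \vdots & & & \vdots \\ X_{j,1} & X_{j,2} & \cdots & X_{j,r} \end{pmatrix} \]

\[ M_j(A^k) = \begin{cases} 1 & \text{if there exists } q = 1, \ldots, m(A^k) \text{ such that } \underline{X}^k_{j,r} = A^k_{(q)} \\ 0 & \text{otherwise} \end{cases} \qquad j = k, k+1, \ldots \]

\[ Z_j(A^k) = M_j(A^k)\, h(X_{j,r+1}), \qquad j = k, k+1, \ldots \]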
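The indicator $M_j(A^k)$ does not require enumerating the permuted matrices: a block of observations equals some row-permuted version of $A^k$ exactly when each of its rows is a rearrangement of the corresponding row of $A^k$. The sketch below uses that observation; the function names and the default `h` (the identity) are illustrative assumptions.

```python
def matches_up_to_row_permutation(X, A):
    """M_j(A): 1 if every row of X is a rearrangement of the same row of A, else 0."""
    same_shape = len(X) == len(A)
    return int(same_shape and all(sorted(x) == sorted(a) for x, a in zip(X, A)))


def z_value(X, A, x_last, h=lambda x: x):
    """Z_j(A) = M_j(A) * h(X_{j,r+1}), with x_last the held-out (r+1)st observation."""
    return matches_up_to_row_permutation(X, A) * h(x_last)
```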
p^)


X  M  (AK) 


.k 


m(A )(i  -  k  +  1) 


?i(Ak) 


I    Z.(AK) 


in 


(AK)(i  -  k  +  1) 


P±(Ak)  k 

-£-  if  P±(AK)  >  0 


P^) 


Pf(Ak)  =< 


i  0       elsewhere 


^ii)  ={ 


r+1 

(  tti  S  h(xi  B)  ^  *  -  ^  —  >  k  -  1 

s=l     ' 

-  Bl/2  if  PtfX?  )    <  -   B1'2 

i  ~i>r  — 

Pjrf  )  if  -Bl/2  <  P*(X?   )  <  Q1' 

1x=1,t'  r  =  i,r' 

i  =  k,  k  +  1, 

if   b1'2  <  p^x3;    ) 

1   1  —  n  ^  i? 


B 


1/2 


We shall prove $\underline{\varphi}^k = (\varphi_1^k, \varphi_2^k, \ldots)$ has property F) provided the
following condition on $\mathcal{P}$ is satisfied. For all $\epsilon$, $0 < \epsilon < 1$,
n        r  i     Js     _r
lim  -     Y.     p 
n^oo11  i=k        Lj^k  /=!  gix     yj_k+i 


i  n  n  pfl    (^.v+i.j  <  i 


i-k+i,s' 


=  0  uniformly  in 


where  X.    .    has  probability  mass  function  p      (•)•   Such  a 
1-k+/'B  9i-k+i 

condition  is  satisfied,  for  example,  if  0e  ft  xe£=>pfl(x)  >  i"|(x)  >  0, 
since  in  this  case 


r    i       k 

z  n 

j=l  i=l  s=l       j-k+i 


P-  (Xi-k+i}   '     i 


.e 


Z 

'i 


Z 

"kr 


I      11      11     P0  (x(i-l)r+s)-' 

x.e9T  x.    e£       Lj=i  i=l  s=l       j-k+i      {        ' 


ft   fi 


i=l  s=l       i-k+i 


9,   _/X(i-l)r+s} 


<     I 


x   e£ 


Z    i 


J. 

II    l(*r«.iW.)> 


1 1 


(i-i)r+s"    ,l-€  , 

e£       Li=l  s=l  i       J 


A      U     p0  (x(^l)r+s) 

i=l  s=l       i-k+i     l        ; 


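where, as before,

\[ I(a, b) = \begin{cases} 1 & \text{if } a < b \\ 0 & \text{otherwise.} \end{cases} \]

But for every $\delta > 0$ there exists a set $\mathcal{X}_\delta \subset \mathcal{X}$ such that $\mathcal{X}_\delta$ has only a
finite number $N_\delta$ of elements and $P[X \in \mathcal{X}_\delta] > 1 - \delta$. Hence: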
Z    •••    I    i

X.,  e%  x,  e£ 

1        kr 


ft   fi 


i(xf»_-,wJ<  -r 


i=.s=1   <wW  ±i-f 


.  Vw^W 


n   n(x(,.1)r+s),  -i_    +  i  -  (i  -  8) 

x  e9fg     x,  e..g   Li=l  s=l  i   J 


kr 


<  1  -  (l  -  5)    for  i  >  i   where  - — ~—  <  min  h(x)] 
-                    -  o        /  .  vl-e      «. 

( i  )       xe£R 
o  o 


Since $\delta$ was arbitrary the result quickly follows. This is not the
only case, however, in which the condition is satisfied, as was seen in
Sections A and C.

The proof that $\underline{\varphi}^k$ has property F) follows the same general lines
as in our other examples. For all $i = k, k+1, \ldots$


*"  ^-tttt-i  Z  11   11  %      <*,,.) 

j=k  i=l  s=l       j-k+i 


Qi(A^    =   i   -   k  +  1     £      S    h(x)p0    (x) 

j=k  xe£ 


j        i=i  s=i      j-k+i      >s 


R.   be  a  set  of  k  x  r  matrices  such  that  P[X.   eR.]  =  1 


=  i,r  i 


and  AkeR.  =>  Q(Ak)  >  0 
i 




Then  for  A  eR.  we  have  that  one  version  of 

1 

Q*(Ak) 

E[X(8   )|XV       =  A   1      is   equal  to     f.      (A   )    =  ~ — r—     when     8       has 
irr    =i,r  i,r  Q.(A  )  m 

probability  measure     G.. 
But 

=[M(XJ  J|x?  r  =  Ak]   =mE       ft     H    P0  (xi   s)      j=k,    ...    ,1-  k 

j  -i,r     -i,r  ^=1     i=1  s=1     yj_k+i     *,b 

and 

.k      *      -* 


=  m(Ak)     [I      ]]     P0  (*j   B)     I     h(x)p     (x) 

i=i  s=i      .i-k+i      '      xea:  ,i 


j   =  k,    .  o .    ,    i  -   k   o 


Hence,  arguing  as  in  Section  A, 


PMP.rf  )  -  Q.rf   )|>5.(Ak)|xk   =  Akl  <  2ke   i 
1  i  i,r     l  i,r      l     -i,r     "  — 


i  -    _  i-k+1  R2 

4-5.   -2  : 5. 

k    i 


provided  5  >  — 


i  -  i  -  k  +  1 


Since $Z_j$ is not bounded we are unable to use lemma 3). We can, however,
use a simple Chebyshev bound, observing that $\mathrm{Var}[Z_j(A^k)]
\le E[h^2(X_{j,r+1})] \le B$; and hence using an argument similar to that in the
proof of lemma 3) we have
Pr|P,(X?        -   Q,(X^      )|    >5.(Ak)|x^        =  Ak]
1-1    i     i,r  ix    i,r''  i        "=i,r 


< 


A 


(l.k+l)|6i._|_J 


provided     5     > 


I  -   i  -  k  +  1 


It thus follows, using a modified version of lemma 4), that for all $A^k \in R_i$:


(8.    +  Bl/25j     . 
PI  I  9^)   "  t^l   ^       'k,  '£,r  =  V 


Q±(Ak) 


45        -2-i^62 
<  2ke     1  e  K        1  + 


k2B 


(i-k+l)(61-r^T1) 


provided     5     > 


i  -  i  -  k  +  1 
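As a numerical aid, the two tail terms in this bound can be evaluated directly. The following is a minimal sketch, assuming the exponential term is $2k\,e^{4\delta}e^{-2(i-k+1)\delta^2/k^2}$ and the Chebyshev term is $k^2B/\bigl((i-k+1)(\delta - k/(i-k+1))^2\bigr)$. The function names are illustrative and not part of the original text, and the cap at 1 is added here only because the quantity bounds a probability.

```python
import math


def exponential_term(i: int, k: int, delta: float) -> float:
    """Exponential tail term 2k * e^{4*delta} * e^{-2*(i-k+1)*delta^2 / k^2}."""
    m = i - k + 1  # number of k-windows available at stage i
    return 2 * k * math.exp(4 * delta) * math.exp(-2 * m * delta**2 / k**2)


def chebyshev_term(i: int, k: int, B: float, delta: float) -> float:
    """Chebyshev tail term k^2 * B / (m * (delta - k/m)^2), with m = i - k + 1."""
    m = i - k + 1
    if delta <= k / m:
        # At delta == k/m the denominator vanishes, so strict inequality is required here.
        raise ValueError("need delta > k / (i - k + 1)")
    return k**2 * B / (m * (delta - k / m) ** 2)


def chebyshev_term_alt(i: int, k: int, B: float, delta: float) -> float:
    """The same term rewritten as k^2 * B * m / (m*delta - k)^2."""
    m = i - k + 1
    if delta <= k / m:
        raise ValueError("need delta > k / (i - k + 1)")
    return k**2 * B * m / (m * delta - k) ** 2


def combined_bound(i: int, k: int, B: float, delta: float) -> float:
    """Sum of both terms, capped at 1 because it bounds a probability."""
    return min(1.0, exponential_term(i, k, delta) + chebyshev_term(i, k, B, delta))
```

For fixed $\delta$ the exponential term decays geometrically in $i$ while the Chebyshev term decays only like $1/i$, so the second term dominates for large $i$.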


Thus  we  may  take 


/  8  .    +  B1' 25  . 


Q, 


if     5.    > 


i  -   i  -  k  +  1 


*i=< 


otherwise 


, „        n   i-k+1  R2  _ 

t-,  4Sl  -g  -k-  51    ■          fc2Bd-fc+l)  .f     -      j, 

2ke        e  + 5      11      o      >  — 

[(i  -  k  + 1)5.  -  kr 


k  +  1 


'i-< 


u 


otherwise 
Let


m1  A 

li  Mil 


/ 


Q1  log  1 


*i=< 


0 


if     Q,    > 


,iA 


i  -  (1  -  k  +  l)log  1 


otherwise 


Then  clearly  11m-  £  E[£  +  £  |  Q  >  a.]  =  0  uniformly  in  0. 
n->  «>  i=k 

Since  our  assumed  condition  assures  that  lim   —  ]T  P[Q.  <  a.] 

n->  oo  L   i=k 


uniformly  in  9,   theorem  2)  is  satisfied,  and  (£       has 

property  F) . 

We observe that in this case, as in the binomial, the choice of which information to neglect at the $i$th stage was arbitrary. In particular $\underline{X}^k_{i,r}$ could have been defined in any one of $(r+1)^k$ ways. We thus could have defined $(r+1)^k$ essentially different estimators $\varphi_u$, each of which would have the desired properties. As in the binomial case it may be shown that

$$\varphi = \frac{1}{(r+1)^k} \sum_{u=1}^{(r+1)^k} \varphi_u$$

is an improved estimate.
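One standard reason an average of estimators cannot do worse under squared error is convexity: the squared error of the mean is at most the mean of the squared errors. The sketch below checks this inequality numerically. The helper names and sample values are hypothetical, and the original does not say that this is the argument it has in mind.

```python
from statistics import fmean


def squared_error(estimate: float, theta: float) -> float:
    """Squared error loss of a single estimate of theta."""
    return (estimate - theta) ** 2


def averaged_vs_individual(estimates: list[float], theta: float) -> tuple[float, float]:
    """Return (loss of the averaged estimate, average loss of the individual estimates)."""
    averaged = fmean(estimates)
    loss_of_average = squared_error(averaged, theta)
    average_of_losses = fmean([squared_error(e, theta) for e in estimates])
    return loss_of_average, average_of_losses
```

By Jensen's inequality the first returned value never exceeds the second, whatever the estimates and the true parameter are.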
We note that if $\mathcal{P}$ had been defined as a class of absolutely continuous distribution functions, a similar decision procedure could have been derived. As in the normal and gamma examples, a sequence $\{c_i\}$ would allow us to treat this continuous case as we did the discrete case, using lemma 5) to show the appropriate limits hold.

G.   The  empirical  Bayes  problem. 

We now consider a modification of our decision problem in which the sequence $\{\theta_i\}$ is not an arbitrary sequence, but is instead a sequence of observations of random variables. If these random variables are independent and identically distributed then the problem has been called the empirical Bayes problem. Many fine articles have been written on this problem and the results obtained have inspired this paper. We shall here, however, consider a more general form of the problem. Instead of assuming the $\theta_i$ to be independent observations of a random variable $\theta$, we assume the sequence $\{\theta_i\}$ to be a realization of a stochastic process $\{\theta_i: i = 1, 2, \ldots\}$ which is strictly stationary of order $k$. In other words, for any $k$ positive integers $i_1, i_2, \ldots, i_k$ and any positive integer $j$, the $k$-dimensional random vectors $(\theta_{i_1}, \theta_{i_2}, \ldots, \theta_{i_k})$ and $(\theta_{i_1+j}, \theta_{i_2+j}, \ldots, \theta_{i_k+j})$ are identically distributed. In particular, we suppose that for all $i = k, k+1, \ldots$ the vector $(\theta_{i-k+1}, \ldots, \theta_i)$ has distribution function $G^k(\underline{y}_k)$. Thus if $G^k(\underline{y}_k) = \prod_{l=1}^{k} G(y_l)$ for some $G$ we would have the standard empirical Bayes case.
If $G^k$ were known and if $\theta_i$ were distributed independently of $(\theta_1, \theta_2, \ldots, \theta_{i-k})$ then the standard Bayes argument would yield $\Lambda = E[\theta_i \mid \underline{X}_i^k]$ as an estimate for $\theta_i$ which minimizes the expected loss and achieves the Bayes risk $R(G^k)$. Even if $\theta_i$ were not distributed independently of $(\theta_1, \theta_2, \ldots, \theta_{i-k})$, $\Lambda$ might still be a "good" estimate, and the risk $R(G^k)$ a reasonable risk to attain. We shall show that any procedure which is asymptotically optimal of $k$th order (derived under the assumption of an arbitrary $\underline{\theta}$) will also achieve asymptotically an average risk less than or equal to $R(G^k)$. To be more precise, we shall show the following:

Let: $\Omega$ be a bounded interval of the real line.

$\{\theta_i,\ i = 1, 2, \ldots\}$ be a strictly stationary stochastic process of order $k$.

$G^n$ be the joint distribution function of $(\theta_1, \ldots, \theta_n)$, $n = 1, 2, \ldots$

$\mathcal{G}$ be the class of all possible sequences of distribution functions $\{G^n;\ n = 1, 2, \ldots\}$ such that $G^n$ is the $n$-dimensional marginal distribution obtained from $G^{n+1}$, $G^n$ satisfies the above definitions, and $G^n$ puts probability one on $\Omega^n$ for all $n$.




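$R(\varphi_i, G^i)$ be the risk of using the estimate $\varphi_i$ for $\theta_i$ when the vector $\underline{\theta}_i$ is distributed according to $G^i$. This risk depends, of course, on the class $\mathcal{P} = \{p_\theta(\cdot): \theta \in \Omega\}$.

$R(\underline{\varphi}, G^n) = \dfrac{1}{n} \sum_{i=1}^{n} R(\varphi_i, G^i)$, where $G^i$ is the $i$-dimensional marginal distribution obtained from $G^n$.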
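To make the Bayes risk benchmark concrete, the sketch below computes the posterior-mean estimate $E[\theta \mid X = x]$ and its average squared-error risk in the simplest case $k = 1$, with a prior supported on finitely many points. The binomial likelihood, the prior support and weights, and all function names are assumptions chosen for illustration; none of them come from the text.

```python
from math import comb
from typing import Callable, Sequence


def binomial_pmf(x: int, n_trials: int, theta: float) -> float:
    """P[X = x] for X ~ Binomial(n_trials, theta) (an assumed example family)."""
    return comb(n_trials, x) * theta**x * (1 - theta) ** (n_trials - x)


def posterior_mean(x: int, n_trials: int,
                   support: Sequence[float], weights: Sequence[float]) -> float:
    """E[theta | X = x] under a prior putting weights[j] on support[j]."""
    joint = [w * binomial_pmf(x, n_trials, t) for t, w in zip(support, weights)]
    return sum(t * j for t, j in zip(support, joint)) / sum(joint)


def average_risk(estimator: Callable[[int], float], n_trials: int,
                 support: Sequence[float], weights: Sequence[float]) -> float:
    """Expected squared error of estimator(x), averaged over the prior."""
    return sum(
        w * binomial_pmf(x, n_trials, t) * (estimator(x) - t) ** 2
        for t, w in zip(support, weights)
        for x in range(n_trials + 1)
    )
```

Passing `lambda x: posterior_mean(x, n_trials, support, weights)` to `average_risk` gives the Bayes risk for this toy prior; any other estimator, such as the sample proportion `x / n_trials`, has average risk at least as large.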


We now state and prove a generalization of a theorem by Samuel [11].

Theorem 4) Let $\mathcal{P} = \{p_\theta(\cdot): \theta \in \Omega\}$ be a class of distribution functions. Let $\underline{\varphi}^k$ be a decision procedure which is asymptotically optimal of $k$th order for $\mathcal{P}$. Then

$$\lim_{n \to \infty} R(\underline{\varphi}^k, G^n) \le R(G^k)$$

for all $\{G^n\} \in \mathcal{G}$. If $\underline{\varphi}^k$ is uniformly asymptotically optimal then the above inequality becomes

$$\lim_{n \to \infty}\ \sup_{\{G^n\} \in \mathcal{G}} \bigl[R(\underline{\varphi}^k, G^n) - R(G^k)\bigr]^+ \le 0.$$
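Proof

Let $R(\varphi_i^k, \underline{\theta}_i) = E[(\varphi_i^k(\underline{X}_i) - \theta_i)^2]$. Then:

$$\begin{aligned}
R(\underline{\varphi}^k, G^n) &= \frac{1}{n} \sum_{i=1}^{n} R(\varphi_i^k, G^i) = \frac{1}{n} \sum_{i=1}^{n} E[(\varphi_i^k(\underline{X}_i) - \theta_i)^2] \\
&= \frac{1}{n} \sum_{i=1}^{n} E\bigl(E[(\varphi_i^k(\underline{X}_i) - \theta_i)^2 \mid \underline{\theta}_i]\bigr) \\
&= \frac{1}{n} \sum_{i=1}^{n} E[R(\varphi_i^k, \underline{\theta}_i)].
\end{aligned}$$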


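For the remainder of the proof we shall let $E_1[\cdot]$ represent expectation where, for all $i \ge k$, $\theta_{i-k+1}, \ldots, \theta_i$ have a priori distribution function $G^k$, and $E_2[\cdot]$ represent expectation where $\theta_{i-k+1}, \ldots, \theta_i$ have a priori distribution function $G_n^k$, the $k$th order empirical distribution function generated by $\theta_1, \theta_2, \ldots, \theta_n$. We now let $\Lambda(\underline{X}_i^k)$ be a form of $E_1[\theta_i \mid \underline{X}_i^k]$ and $\psi(\underline{X}_i^k)$ be a form of $E_2[\theta_i \mid \underline{X}_i^k]$. Then $\Lambda$ achieves the risk $R(G^k)$ and $\psi$ the risk $R_k(\underline{\theta}_n)$. We observe that for all $\underline{\theta}_k$

$$E_1\bigl[(\Lambda(\underline{X}_i^k) - \theta_i)^2 \bigm| (\theta_{i-k+1}, \ldots, \theta_i) = \underline{\theta}_k\bigr] = E_2\bigl[(\Lambda(\underline{X}_i^k) - \theta_i)^2 \bigm| (\theta_{i-k+1}, \ldots, \theta_i) = \underline{\theta}_k\bigr].$$

We call this common value $L(\underline{\theta}_k)$. Then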
— k 
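$$\begin{aligned}
R_k(\underline{\theta}_n) &= E_2[(\psi(\underline{X}^k) - \theta_k)^2] \\
&\le E_2[(\Lambda(\underline{X}^k) - \theta_k)^2] \\
&= E_2\bigl(E_2[(\Lambda(\underline{X}^k) - \theta_k)^2 \mid \underline{\theta}_k]\bigr) \\
&= \frac{1}{n-k+1} \sum_{i=k}^{n} L(\theta_{i-k+1}, \ldots, \theta_i).
\end{aligned}$$

Hence

$$E_1[R_k(\underline{\theta}_n)] \le \frac{1}{n-k+1} \sum_{i=k}^{n} E_1[L(\theta_{i-k+1}, \ldots, \theta_i)] = \frac{1}{n-k+1} \sum_{i=k}^{n} R(G^k) = R(G^k).$$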

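But since $\underline{\varphi}^k$ is asymptotically optimal we have

$$E_1\Bigl\{\lim_{n \to \infty} \Bigl[\frac{1}{n} \sum_{i=1}^{n} R(\varphi_i^k, \underline{\theta}_i) - R_k(\underline{\theta}_n)\Bigr]\Bigr\} \le 0$$

and hence, since our losses are bounded,

$$\lim_{n \to \infty} \Bigl[\frac{1}{n} \sum_{i=1}^{n} E_1[R(\varphi_i^k, \underline{\theta}_i)] - E_1[R_k(\underline{\theta}_n)]\Bigr] \le 0.$$

Since $E_1[R_k(\underline{\theta}_n)] \le R(G^k)$, it follows that

$$\lim_{n \to \infty} \bigl[R(\underline{\varphi}^k, G^n) - R(G^k)\bigr] \le 0.$$

The proof of the second part of the theorem follows immediately.

Q.E.D.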
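The $k$th order empirical distribution used in the proof places mass $1/(n-k+1)$ on each of the windows $(\theta_{i-k+1}, \ldots, \theta_i)$, $i = k, \ldots, n$. A minimal sketch, assuming the parameter values are hashable so that repeated windows can be pooled; the function names are illustrative and do not appear in the original.

```python
from collections import Counter
from fractions import Fraction
from typing import Callable, Hashable, Sequence


def kth_order_empirical(thetas: Sequence[Hashable], k: int) -> dict[tuple, Fraction]:
    """Mass function of the k-th order empirical distribution of theta_1..theta_n."""
    n = len(thetas)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= len(thetas)")
    # Window i (1-based, i = k..n) is (theta_{i-k+1}, ..., theta_i).
    counts = Counter(tuple(thetas[i - k:i]) for i in range(k, n + 1))
    total = n - k + 1
    return {window: Fraction(c, total) for window, c in counts.items()}


def average_over_windows(thetas: Sequence[Hashable], k: int,
                         L: Callable[[tuple], float]) -> float:
    """(1/(n-k+1)) * sum_{i=k}^{n} L(theta_{i-k+1}, ..., theta_i)."""
    n = len(thetas)
    return sum(L(tuple(thetas[i - k:i])) for i in range(k, n + 1)) / (n - k + 1)
```

Taking the expectation of a function `L` under the returned mass function gives the same number as `average_over_windows`, which is the identity used in the last line of the chain for $R_k(\underline{\theta}_n)$.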


Corollary.

If in addition to the assumptions of theorem 4) we add the condition that for all $j = k+1, k+2, \ldots$ $\theta_j$ is distributed independently of the vector $\underline{\theta}_{j-k}$, then the two conclusions of the theorem may be replaced by

$$\lim_{n \to \infty} R(\underline{\varphi}^k, G^n) = R(G^k)$$

and

$$\lim_{n \to \infty} R(\underline{\varphi}^k, G^n) = R(G^k) \quad \text{uniformly for } \{G^n\} \in \mathcal{G}$$

respectively.


Proof:

To prove both parts of the corollary it is sufficient to show $\lim_{n \to \infty} R(\underline{\varphi}^k, G^n) \ge R(G^k)$. But since $\theta_j$ is independent of $\underline{\theta}_{j-k}$, $R(G^k)$ is the minimum risk that can be attained by any estimate of $\theta_j$. Hence $R(\varphi_i^k, G^i) \ge R(G^k)$ for all $i \ge k$. It may be shown that $i < k \Rightarrow R(G^i) \ge R(G^k)$, so that $R(\varphi_i^k, G^i) \ge R(G^i) \ge R(G^k)$ for all $i < k$. Thus $R(\underline{\varphi}^k, G^n) \ge R(G^k)$ for all $n$, so that $\lim_{n \to \infty} R(\underline{\varphi}^k, G^n) \ge R(G^k)$ as desired.

Q.E.D.
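The last step of this proof averages the termwise inequalities. Written out with the definition of the average risk, and assuming only the inequalities $R(\varphi_i^k, G^i) \ge R(G^k)$ stated in the proof for every $i$:

```latex
R(\underline{\varphi}^k, G^n)
  = \frac{1}{n} \sum_{i=1}^{n} R(\varphi_i^k, G^i)
  \ge \frac{1}{n} \sum_{i=1}^{n} R(G^k)
  = R(G^k)
  \qquad \text{for all } n .
```

Combined with the upper bound of the theorem, this lower bound forces the limit of the average risk to equal $R(G^k)$.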